* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
@ 2006-01-20 2:18 ` Jesse Barnes
2006-01-20 6:47 ` Brent Casavant
2006-01-20 13:26 ` Jack Steiner
2006-01-20 8:36 ` Ingo Molnar
` (63 subsequent siblings)
64 siblings, 2 replies; 73+ messages in thread
From: Jesse Barnes @ 2006-01-20 2:18 UTC (permalink / raw)
To: Brent Casavant; +Cc: linux-ia64, linux-kernel, jes, tony.luck
On Thursday, January 19, 2006 4:06 pm, Brent Casavant wrote:
> #ifndef __ARCH_WANT_UNLOCKED_CTXSW
> static inline int task_running(runqueue_t *rq, task_t *p)
> @@ -936,6 +939,7 @@ static int migrate_task(task_t *p, int d
> * it is sufficient to simply update the task's cpu field.
> */
> if (!p->array && !task_running(rq, p)) {
> + arch_task_migrate(p);
> set_task_cpu(p, dest_cpu);
> return 0;
> }
> @@ -1353,6 +1357,7 @@ static int try_to_wake_up(task_t *p, uns
> out_set_cpu:
> new_cpu = wake_idle(new_cpu, p);
> if (new_cpu != cpu) {
> + arch_task_migrate(p);
> set_task_cpu(p, new_cpu);
> task_rq_unlock(rq, &flags);
> /* might preempt at this point */
> @@ -1876,6 +1881,7 @@ void pull_task(runqueue_t *src_rq, prio_
> {
> dequeue_task(p, src_array);
> dec_nr_running(p, src_rq);
> + arch_task_migrate(p);
> set_task_cpu(p, this_cpu);
> inc_nr_running(p, this_rq);
> enqueue_task(p, this_array);
> @@ -4547,6 +4553,7 @@ static void __migrate_task(struct task_s
> if (!cpu_isset(dest_cpu, p->cpus_allowed))
> goto out;
>
> + arch_task_migrate(p);
> set_task_cpu(p, dest_cpu);
> if (p->array) {
> /*
Maybe you could just turn the above into mmiowb() calls instead? That
would cover altix, origin, and ppc as well I think. On other platforms
it would be a complete no-op.
Jesse
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 2:18 ` Jesse Barnes
@ 2006-01-20 6:47 ` Brent Casavant
2006-01-20 17:36 ` Jesse Barnes
2006-01-20 13:26 ` Jack Steiner
1 sibling, 1 reply; 73+ messages in thread
From: Brent Casavant @ 2006-01-20 6:47 UTC (permalink / raw)
To: Jesse Barnes; +Cc: linux-ia64, linux-kernel, jes, tony.luck
On Thu, 19 Jan 2006, Jesse Barnes wrote:
> Maybe you could just turn the above into mmiowb() calls instead? That
> would cover altix, origin, and ppc as well I think. On other platforms
> it would be a complete no-op.
As you obviously noted, the core of the code was lifted from mmiowb().
But no, an mmiowb() as such isn't correct. At the time this code is
executing, it's on a CPU remote from the one which issued any PIO writes
to the device. So in this case we need to poll the Shub register for
a remote node, but mmiowb() only polls for the Shub corresponding to
the current CPU.
My first incarnation of this patch (never publicly presented) did
implement a new mmiowb_remote(cpu) machvec instead, and this was
placed in the context-switch (in) path instead of the task migration
path. However, since this behavior is only needed for the task
migration case, Jack Steiner pointed out that this was a more
appropriate way to implement it. As migration is much less frequent
than context switching, this is a better-performing method to solve
the problem.
Thanks,
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 6:47 ` Brent Casavant
@ 2006-01-20 17:36 ` Jesse Barnes
2006-01-20 20:01 ` Brent Casavant
0 siblings, 1 reply; 73+ messages in thread
From: Jesse Barnes @ 2006-01-20 17:36 UTC (permalink / raw)
To: Brent Casavant; +Cc: linux-ia64, linux-kernel, jes, tony.luck
On Thursday, January 19, 2006 10:47 pm, Brent Casavant wrote:
> On Thu, 19 Jan 2006, Jesse Barnes wrote:
> > Maybe you could just turn the above into mmiowb() calls instead?
> > That would cover altix, origin, and ppc as well I think. On other
> > platforms it would be a complete no-op.
>
> As you obviously noted, the core of the code was lifted from mmiowb().
> But no, an mmiowb() as such isn't correct. At the time this code is
> executing, it's on a CPU remote from the one which issued any PIO
> writes to the device. So in this case we need to poll the Shub
> register for a remote node, but mmiowb() only polls for the Shub
> corresponding to the current CPU.
Ah, ok. It sounds like Ingo might have a better place to put it anyway.
(I was thinking this was on the switch out path on the CPU where the
task last ran, didn't look at it in detail.)
Of course, the other option is just to require tasks that do MMIO
accesses from userspace to be pinned to particular CPU or node. :)
Thanks,
Jesse
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 17:36 ` Jesse Barnes
@ 2006-01-20 20:01 ` Brent Casavant
0 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-20 20:01 UTC (permalink / raw)
To: Jesse Barnes; +Cc: linux-ia64, linux-kernel, jes, tony.luck
On Fri, 20 Jan 2006, Jesse Barnes wrote:
> Of course, the other option is just to require tasks that do MMIO
> accesses from userspace to be pinned to particular CPU or node. :)
One idea I had was to add a counter into the mm struct that gets
bumped if the process performs any MMIO mappings, so that only
affected processes pay the penalty. However, the added complexity
in the drivers (e.g. handling partial unmaps, etc.) doesn't seem worth
it. On average this code adds 800ns to the task migration path, which
is relatively infrequent and already a bit expensive (what with cold
caches and the like).
Regarding the direction Ingo sent me down, and considering what Jack
said about needing a hook for a future platform, I'm thinking of grabbing
a bit in task->thread.flags that IA64_HAS_EXTRA_STATE() could detect and
let ia64_{save,load}_extra() call new machvecs to perform this
chipset-specific context management. It's a bit overengineered for
my particular case, but would allow Jack to plug in his work very
cleanly.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 2:18 ` Jesse Barnes
2006-01-20 6:47 ` Brent Casavant
@ 2006-01-20 13:26 ` Jack Steiner
2006-01-20 17:31 ` Jesse Barnes
1 sibling, 1 reply; 73+ messages in thread
From: Jack Steiner @ 2006-01-20 13:26 UTC (permalink / raw)
To: Jesse Barnes; +Cc: Brent Casavant, linux-ia64, linux-kernel, jes, tony.luck
On Thu, Jan 19, 2006 at 06:18:43PM -0800, Jesse Barnes wrote:
> On Thursday, January 19, 2006 4:06 pm, Brent Casavant wrote:
> > #ifndef __ARCH_WANT_UNLOCKED_CTXSW
> > static inline int task_running(runqueue_t *rq, task_t *p)
> > @@ -936,6 +939,7 @@ static int migrate_task(task_t *p, int d
> > * it is sufficient to simply update the task's cpu field.
> > */
> > if (!p->array && !task_running(rq, p)) {
> > + arch_task_migrate(p);
> > set_task_cpu(p, dest_cpu);
> > return 0;
> > }
> > @@ -1353,6 +1357,7 @@ static int try_to_wake_up(task_t *p, uns
> > out_set_cpu:
> > new_cpu = wake_idle(new_cpu, p);
> > if (new_cpu != cpu) {
> > + arch_task_migrate(p);
> > set_task_cpu(p, new_cpu);
> > task_rq_unlock(rq, &flags);
> > /* might preempt at this point */
> > @@ -1876,6 +1881,7 @@ void pull_task(runqueue_t *src_rq, prio_
> > {
> > dequeue_task(p, src_array);
> > dec_nr_running(p, src_rq);
> > + arch_task_migrate(p);
> > set_task_cpu(p, this_cpu);
> > inc_nr_running(p, this_rq);
> > enqueue_task(p, this_array);
> > @@ -4547,6 +4553,7 @@ static void __migrate_task(struct task_s
> > if (!cpu_isset(dest_cpu, p->cpus_allowed))
> > goto out;
> >
> > + arch_task_migrate(p);
> > set_task_cpu(p, dest_cpu);
> > if (p->array) {
> > /*
>
> Maybe you could just turn the above into mmiowb() calls instead? That
> would cover altix, origin, and ppc as well I think. On other platforms
> it would be a complete no-op.
>
> Jesse
I don't think calling mmiob() directly would work. In order to make
CONFIG_IA64_GENERIC work, the call to mmiob() needs to be underneath a
platform vector. Using ia64_platform_is() would also work but I think
a platform vector is cleaner.
A second reason for an arch_task_migrate() instead of a specific mmiob() is
to provide a hook for a future platform that require additional work
to be done when a task migrates.
--
Thanks
Jack Steiner (steiner@sgi.com) 651-683-5302
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 13:26 ` Jack Steiner
@ 2006-01-20 17:31 ` Jesse Barnes
2006-01-20 19:00 ` Jack Steiner
0 siblings, 1 reply; 73+ messages in thread
From: Jesse Barnes @ 2006-01-20 17:31 UTC (permalink / raw)
To: Jack Steiner; +Cc: Brent Casavant, linux-ia64, linux-kernel, jes, tony.luck
On Friday, January 20, 2006 5:26 am, Jack Steiner wrote:
> I don't think calling mmiob() directly would work. In order to make
> CONFIG_IA64_GENERIC work, the call to mmiob() needs to be underneath a
> platform vector. Using ia64_platform_is() would also work but I think
> a platform vector is cleaner.
mmiowb is already a platform vector on ia64, so I think you're ok there.
> A second reason for an arch_task_migrate() instead of a specific
> mmiob() is to provide a hook for a future platform that require
> additional work to be done when a task migrates.
What does the new platform require (just curious)?
Jesse
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 17:31 ` Jesse Barnes
@ 2006-01-20 19:00 ` Jack Steiner
0 siblings, 0 replies; 73+ messages in thread
From: Jack Steiner @ 2006-01-20 19:00 UTC (permalink / raw)
To: Jesse Barnes; +Cc: Brent Casavant, linux-ia64, linux-kernel, jes, tony.luck
> > A second reason for an arch_task_migrate() instead of a specific
> > mmiob() is to provide a hook for a future platform that require
> > additional work to be done when a task migrates.
>
> What does the new platform require (just curious)?
>
> Jesse
Sorry, can't say. Rejoin SGI & I'll tell you :-)
---
Jack
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
2006-01-20 2:18 ` Jesse Barnes
@ 2006-01-20 8:36 ` Ingo Molnar
2006-01-20 16:14 ` Brent Casavant
2006-01-24 0:33 ` Brent Casavant
` (62 subsequent siblings)
64 siblings, 1 reply; 73+ messages in thread
From: Ingo Molnar @ 2006-01-20 8:36 UTC (permalink / raw)
To: Brent Casavant; +Cc: linux-ia64, linux-kernel, jes, tony.luck
* Brent Casavant <bcasavan@sgi.com> wrote:
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -291,6 +291,9 @@ for (domain = rcu_dereference(cpu_rq(cpu
> #ifndef finish_arch_switch
> # define finish_arch_switch(prev) do { } while (0)
> #endif
> +#ifndef arch_task_migrate
> +# define arch_task_migrate(task) do { } while (0)
> +#endif
> if (!p->array && !task_running(rq, p)) {
> + arch_task_migrate(p);
> set_task_cpu(p, dest_cpu);
> if (new_cpu != cpu) {
> + arch_task_migrate(p);
> set_task_cpu(p, new_cpu);
> dec_nr_running(p, src_rq);
> + arch_task_migrate(p);
> set_task_cpu(p, this_cpu);
> + arch_task_migrate(p);
> set_task_cpu(p, dest_cpu);
hm, why isnt the synchronization done in switch_to()? Your arch-level
switch_to() could have something like thread->last_cpu_sync, and if
thread->last_cpu_sync != this_cpu, do the flush. This would not only
keep this stuff out of the generic scheduler, but it would also optimize
things a bit more: the moment we do a set_task_cpu() it does not mean
that CPU _will_ run the task. Another CPU could grab that task later on.
So we should delay such IO-synchronization to the last possible moment:
when we know that we've hit a new CPU on which we havent done a flush
yet. For same-CPU context switches there wouldnt be any extra
synchronization, because thread->last_cpu_sync = this_cpu.
Ingo
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 8:36 ` Ingo Molnar
@ 2006-01-20 16:14 ` Brent Casavant
0 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-20 16:14 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-ia64, linux-kernel, jes, tony.luck
On Fri, 20 Jan 2006, Ingo Molnar wrote:
> hm, why isnt the synchronization done in switch_to()? Your arch-level
> switch_to() could have something like thread->last_cpu_sync, and if
> thread->last_cpu_sync != this_cpu, do the flush. This would not only
> keep this stuff out of the generic scheduler, but it would also optimize
> things a bit more: the moment we do a set_task_cpu() it does not mean
> that CPU _will_ run the task. Another CPU could grab that task later on.
Very good points all around. I'll rework the changes in just the
manner you mentioned.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread
* [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
2006-01-20 2:18 ` Jesse Barnes
2006-01-20 8:36 ` Ingo Molnar
@ 2006-01-24 0:33 ` Brent Casavant
2006-01-24 0:48 ` Luck, Tony
` (61 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 0:33 UTC (permalink / raw)
To: linux-ia64
Take 2.
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted (the
non-SN2 case is a no-op). This is placed in the context switch-in
path, and only performs the wait when the newly scheduled task changes
CPUs.
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
arch/ia64/sn/kernel/setup.c | 8 +++++---
arch/ia64/sn/kernel/sn2/sn2_smp.c | 33 ++++++++++++++++++++++++++++++++-
include/asm-ia64/machvec.h | 12 ++++++++++++
include/asm-ia64/machvec_sn2.h | 4 +++-
include/asm-ia64/system.h | 1 +
include/asm-ia64/thread_info.h | 10 ++++++++++
6 files changed, 63 insertions(+), 5 deletions(-)
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
index e510dce..9a52dfa 100644
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -654,8 +654,10 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
- pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
+ pda->pio_write_status_addr = (volatile unsigned long *)
+ GLOBAL_MMR_ADDR(nasid, pio[slice]);
+ pda->pio_write_status_val = is_shub1() ?
+ SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
/*
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
index 471bbaa..1cca3f2 100644
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,37 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_switch_from - SN-specific migrate-from actions
+ * @task: Task being switched to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_switch_from(struct task_struct *task)
+{
+ __u32 *piowr_cpu = &task_thread_info(task)->sn2_piowr_cpu;
+ __u32 this_cpu = task_cpu(task);
+ pda_t *piowr_pda;
+ volatile unsigned long *adr;
+ unsigned long val;
+
+ if (likely(this_cpu = *piowr_cpu))
+ return;
+
+ /* Drain PIO writes from old CPU's Shub */
+ piowr_pda = pdacpu(*piowr_cpu);
+ adr = piowr_pda->pio_write_status_addr;
+ val = piowr_pda->pio_write_status_val;
+ while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
+ cpu_relax();
+
+ /* Next time drain this CPU's Shub */
+ *piowr_cpu = this_cpu;
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm = current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
index ca5ea99..9b54073 100644
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -34,6 +34,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_switch_from_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +86,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +152,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_switch_from ia64_mv.switch_from
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +201,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_switch_from_t *switch_from;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +246,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_switch_from, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +395,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_switch_from
+# define platform_switch_from machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
index e1b6cd6..110dc54 100644
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_switch_from_t sn_switch_from;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_switch_from sn_switch_from
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 80c5a23..0508327 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -243,6 +243,7 @@ extern void ia64_load_extra (struct task
(prev)->thread.flags |= IA64_THREAD_FPH_VALID; \
__ia64_save_fpu((prev)->thread.fph); \
} \
+ platform_switch_from(next); \
__switch_to(prev, next, last); \
} while (0)
#else
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
index 1d6518f..c63494d 100644
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -36,10 +36,19 @@ struct thread_info {
unsigned long start_time;
pid_t pid;
} sigdelayed; /* Saved information for TIF_SIGDELAYED */
+#if defined(CONFIG_IA64_SGI_SN2) || defined(CONFIG_IA64_GENERIC)
+ __u32 sn2_piowr_cpu; /* CPU needing PIO write drain */
+#endif
};
#define THREAD_SIZE KERNEL_STACK_SIZE
+#if defined(CONFIG_IA64_SGI_SN2) || defined(CONFIG_IA64_GENERIC)
+#define INIT_TI_SN2_PIOWR_CPU .sn2_piowr_cpu = 0,
+#else
+#define INIT_TI_SN2_PIOWR_CPU
+#endif
+
#define INIT_THREAD_INFO(tsk) \
{ \
.task = &tsk, \
@@ -51,6 +60,7 @@ struct thread_info {
.restart_block = { \
.fn = do_no_restart_syscall, \
}, \
+ INIT_TI_SN2_PIOWR_CPU \
}
#ifndef ASM_OFFSETS_C
^ permalink raw reply related [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (2 preceding siblings ...)
2006-01-24 0:33 ` Brent Casavant
@ 2006-01-24 0:48 ` Luck, Tony
2006-01-24 1:23 ` Brent Casavant
` (60 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-24 0:48 UTC (permalink / raw)
To: linux-ia64
> The following patch implements this behavior on SN2 by waiting for a
> Shub register to indicate that these writes have been accepted (the
> non-SN2 case is a no-op). This is placed in the context switch-in
> path, and only performs the wait when the newly scheduled task changes
> CPUs.
> --- a/include/asm-ia64/system.h
> +++ b/include/asm-ia64/system.h
> @@ -243,6 +243,7 @@ extern void ia64_load_extra (struct task
> (prev)->thread.flags |= IA64_THREAD_FPH_VALID; \
> __ia64_save_fpu((prev)->thread.fph); \
> } \
> + platform_switch_from(next); \
> __switch_to(prev, next, last); \
> } while (0)
On a generic kernel running on non-sn2 h/w, surely this just
added a function call to an empty function into every context
switch?
Thinking about this a bit more, wouldn't it be better to do this
work in the rebalance code that is making the decision to move
the process. The earlier argument that this would sometimes
waste time because the process might not actually move doesn't
sound so strong ... we context switch frequently, we move processes
rarely. Doing some work that may be wasted in the rare path sounds
better than always doing work in the common path.
-Tony
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (3 preceding siblings ...)
2006-01-24 0:48 ` Luck, Tony
@ 2006-01-24 1:23 ` Brent Casavant
2006-01-24 1:42 ` Keith Owens
` (59 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 1:23 UTC (permalink / raw)
To: linux-ia64
On Mon, 23 Jan 2006, Luck, Tony wrote:
> Thinking about this a bit more, wouldn't it be better to do this
> work in the rebalance code that is making the decision to move
> the process. The earlier argument that this would sometimes
> waste time because the process might not actually move doesn't
> sound so strong ... we context switch frequently, we move processes
> rarely. Doing some work that may be wasted in the rare path sounds
> better than always doing work in the common path.
That's my opinion too, and what the original patch implemented, but
met with a little resistance from Ingo (Ingo: sorry, I forgot to CC
you on the new patch to linux-ia64, let me know if you want a copy).
I agree with him that it's not wonderful to hook into mainline
scheduler code to solve this, but the alternative is what I presented
here: check if old CPU = new CPU at each context switch-in, entirely
within IA64 code.
An even better solution might be to have a hook in the main scheduler
which sets a "task has migrated" flag in the IA64 thread_info.flags and
add that flag to the IA64_HAS_EXTRA_STATE() macro. ia64_load_extra()
and a machvec can then do the heavy lifting. This way the migration
code would never do anything more than flip a bit, and we'd only make
the function call and wait on the Shub register at switch-in time on
the new CPU. A side-effect of this would be laying most of the
necessary groundwork for what Jack Steiner referred to regarding
a future platform.
Does that, at least on the surface, sound palatable to each of you?
(I'm obviously not asking for a commitment to accept it, just "the
idea doesn't make me wretch").
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (4 preceding siblings ...)
2006-01-24 1:23 ` Brent Casavant
@ 2006-01-24 1:42 ` Keith Owens
2006-01-24 3:41 ` Grant Grundler
` (58 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Keith Owens @ 2006-01-24 1:42 UTC (permalink / raw)
To: linux-ia64
Brent Casavant (on Mon, 23 Jan 2006 19:23:50 -0600 (CST)) wrote:
>An even better solution might be to have a hook in the main scheduler
>which sets a "task has migrated" flag in the IA64 thread_info.flags and
>add that flag to the IA64_HAS_EXTRA_STATE() macro. ia64_load_extra()
>and a machvec can then do the heavy lifting.
The problem with setting arch specific flags in mailine code is the
need to use #ifdef or to define dummy funcions for all architectures.
Apart from that, the idea works for me. You can pinch the
TIF_SIGDELAYED bit, which I just removed.
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (5 preceding siblings ...)
2006-01-24 1:42 ` Keith Owens
@ 2006-01-24 3:41 ` Grant Grundler
2006-01-24 6:30 ` Brent Casavant
` (57 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Grant Grundler @ 2006-01-24 3:41 UTC (permalink / raw)
To: linux-ia64
On Mon, Jan 23, 2006 at 06:33:39PM -0600, Brent Casavant wrote:
> However, when MMIO writes can be performed from user space (e.g. DRM)
> there are no such guarantees and mechanisms, as the process may
> context-switch at any time, and may migrate to a different CPU as part
> of the switch. For such programs/hardware to operate correctly, it is
> required that the MMIO writes from the old CPU be accepted by the IO
> hardware before subsequent writes from the new CPU can be issued.
Would an app first need to mmap an uncached address?
Or some other obvious "marker" that might warn the kernel?
I'm wondering if one could check some easily reachable state
(or flag) before calling platform_switch_from() and thus
mitigate Tony's concern.
grant
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (6 preceding siblings ...)
2006-01-24 3:41 ` Grant Grundler
@ 2006-01-24 6:30 ` Brent Casavant
2006-01-24 6:41 ` Brent Casavant
` (56 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 6:30 UTC (permalink / raw)
To: linux-ia64
On Mon, 23 Jan 2006, Grant Grundler wrote:
> Would an app first need to mmap an uncached address?
> Or some other obvious "marker" that might warn the kernel?
I'm not sure an uncached address would be appropriate, as I
believe there are other non-IO uncached mappings which can
be performed (whatever the fetchop stuff morphed into, for
example -- I can't remember what its called now).
The only obvious marker I can thing of would be a flag in the
mm struct set by a device driver when such a mapping occurs.
All relevant device drivers would need to be updated, and we'd
need to handle partial unmappings, shared mappings and the like.
> I'm wondering if one could check some easily reachable state
> (or flag) before calling platform_switch_from() and thus
> mitigate Tony's concern.
I can't think of any obviously correct state/flag to inspect,
short of the other suggestion I made tonight.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (7 preceding siblings ...)
2006-01-24 6:30 ` Brent Casavant
@ 2006-01-24 6:41 ` Brent Casavant
2006-01-24 7:04 ` Grant Grundler
` (55 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 6:41 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Keith Owens wrote:
> The problem with setting arch specific flags in mailine code is the
> need to use #ifdef or to define dummy funcions for all architectures.
> Apart from that, the idea works for me. You can pinch the
> TIF_SIGDELAYED bit, which I just removed.
It's actually quite clean. See prepare_arch_switch() and
finish_arch_switch() as an example of how I'd pattern the code.
Those are done by #ifdef, but very clearly isolated.
Hmm, I might be able to utilize prepare_arch_switch() -- I'll have
to take a closer look when I get back into the office Tuesday. I
sure hope that would work, as it would avoid the need for changes to
kernel/sched.c.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (8 preceding siblings ...)
2006-01-24 6:41 ` Brent Casavant
@ 2006-01-24 7:04 ` Grant Grundler
2006-01-24 9:02 ` Ingo Molnar
` (54 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Grant Grundler @ 2006-01-24 7:04 UTC (permalink / raw)
To: linux-ia64
On Tue, Jan 24, 2006 at 12:30:42AM -0600, Brent Casavant wrote:
> On Mon, 23 Jan 2006, Grant Grundler wrote:
> > Would an app first need to mmap an uncached address?
> > Or some other obvious "marker" that might warn the kernel?
>
> I'm not sure an uncached address would be appropriate, as I
> believe there are other non-IO uncached mappings which can
> be performed (whatever the fetchop stuff morphed into, for
> example -- I can't remember what its called now).
OK - just not many user space programs map uncache IO space.
> The only obvious marker I can thing of would be a flag in the
> mm struct set by a device driver when such a mapping occurs.
Device driver? Your original email only talks about user space:
http://marc.theaimsgroup.com/?l=linux-ia64&m\x113771591308925&w=2
Something that has to go through syscall/mmap and eventually
into IA64 specific code to setup a virtual address.
Or did I miss some later bit about a kernel driver?
> I can't think of any obviously correct state/flag to inspect,
> short of the other suggestion I made tonight.
ok. nevermind then...
grant
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (9 preceding siblings ...)
2006-01-24 7:04 ` Grant Grundler
@ 2006-01-24 9:02 ` Ingo Molnar
2006-01-24 9:14 ` Jes Sorensen
` (53 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Ingo Molnar @ 2006-01-24 9:02 UTC (permalink / raw)
To: linux-ia64
* Brent Casavant <bcasavan@sgi.com> wrote:
> I agree with him that it's not wonderful to hook into mainline
> scheduler code to solve this, but the alternative is what I presented
> here: check if old CPU = new CPU at each context switch-in, entirely
> within IA64 code.
what's the problem with doing that? It's a single comparison [of two
values we already have accessed in that function!] and a rarely taken
branch in a single subarch of a single arch's switch_to() function. If
this use (of migration related arch functionality) becomes more
widespread then we can generalize it, but right now i just dont see the
point, given that the two solutions are almost totally equivalent in
terms of "overhead".
Ingo
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (10 preceding siblings ...)
2006-01-24 9:02 ` Ingo Molnar
@ 2006-01-24 9:14 ` Jes Sorensen
2006-01-24 12:10 ` Robin Holt
` (52 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Jes Sorensen @ 2006-01-24 9:14 UTC (permalink / raw)
To: linux-ia64
Luck, Tony wrote:
> Thinking about this a bit more, wouldn't it be better to do this
> work in the rebalance code that is making the decision to move
> the process. The earlier argument that this would sometimes
> waste time because the process might not actually move doesn't
> sound so strong ... we context switch frequently, we move processes
> rarely. Doing some work that may be wasted in the rare path sounds
> better than always doing work in the common path.
Alternatively one could make platform_switch_from() a wrapper around
a function pointer that is only set when in use. Then it would be a
load and a compare and the rest will be predicated away on non SN
hardware in the generic kernel.
Cheers,
Jes
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (11 preceding siblings ...)
2006-01-24 9:14 ` Jes Sorensen
@ 2006-01-24 12:10 ` Robin Holt
2006-01-24 16:40 ` Grant Grundler
` (51 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Robin Holt @ 2006-01-24 12:10 UTC (permalink / raw)
To: linux-ia64
On Mon, Jan 23, 2006 at 11:04:21PM -0800, Grant Grundler wrote:
> On Tue, Jan 24, 2006 at 12:30:42AM -0600, Brent Casavant wrote:
> > On Mon, 23 Jan 2006, Grant Grundler wrote:
> > > Would an app first need to mmap an uncached address?
> > > Or some other obvious "marker" that might warn the kernel?
> >
> > I'm not sure an uncached address would be appropriate, as I
> > believe there are other non-IO uncached mappings which can
> > be performed (whatever the fetchop stuff morphed into, for
> > example -- I can't remember what its called now).
>
> OK - just not many user space programs map uncache IO space.
The mspec driver does. It has not gotten into the mainline kernel yet,
but we continue to build an add-on module for the necessary releases.
To prevent this out of order write from happening on those machines,
we require the process to be pinned to a single cpu. For MPI, this
is a normal way to operate for other reasons as well.
Brent, if you find a way to prevent this out-of-order write, please
remind me to include it in the mspec driver.
Thanks,
Robin
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (12 preceding siblings ...)
2006-01-24 12:10 ` Robin Holt
@ 2006-01-24 16:40 ` Grant Grundler
2006-01-24 16:52 ` Brent Casavant
` (50 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Grant Grundler @ 2006-01-24 16:40 UTC (permalink / raw)
To: linux-ia64
On Tue, Jan 24, 2006 at 06:10:52AM -0600, Robin Holt wrote:
> > OK - just not many user space programs map uncache IO space.
>
> The mspec driver does. It has not gotten into the mainline kernel yet,
> but we continue to build an add-on module for the necessary releases.
I don't know what mspec driver does. I'll guess it's something
similar to RDMA but using altix/sn2 chipset as the "fabric".
Ie part of the "driver" is really a user space lib.
> To prevent this out of order write from happening on those machines,
> we require the process to be pinned to a single cpu. For MPI, this
> is a normal way to operate for other reasons as well.
My objective was to figure out if the kernel could detect when an
application might access MMIO space and then a platform vector
could be invoked to deal with the MMIO write ordering.
But maybe that's not necessary if you can avoid the problem by
pinning the process in the mspec user space library. Then
the whole problem of process migration can be handled
in mspec user library, no?
Maybe that's a moot discussion.
> Brent, if you find a way to prevent this out-of-order write, please
> remind me to include it in the mspec driver.
RDMA support will need something more general.
(I'm asuming mspec is sn2 specific.)
thanks,
grant
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (13 preceding siblings ...)
2006-01-24 16:40 ` Grant Grundler
@ 2006-01-24 16:52 ` Brent Casavant
2006-01-24 16:57 ` Brent Casavant
` (49 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 16:52 UTC (permalink / raw)
To: linux-ia64
On Mon, 23 Jan 2006, Grant Grundler wrote:
> On Tue, Jan 24, 2006 at 12:30:42AM -0600, Brent Casavant wrote:
> > The only obvious marker I can thing of would be a flag in the
> > mm struct set by a device driver when such a mapping occurs.
>
> Device driver? Your original email only talks about user space:
> http://marc.theaimsgroup.com/?l=linux-ia64&m\x113771591308925&w=2
Right. But the whole situation occurs when user space mmap()s
an IO device (e.g. a PCI aperture). That mmap() call triggers
kernel device driver mmap code to take place, and that is where
the flag would need to be set.
Sorry I wasn't clear about that.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (14 preceding siblings ...)
2006-01-24 16:52 ` Brent Casavant
@ 2006-01-24 16:57 ` Brent Casavant
2006-01-24 17:00 ` Robin Holt
` (48 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 16:57 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Robin Holt wrote:
> The mspec driver does. It has not gotten into the mainline kernel yet,
> but we continue to build an add-on module for the necessary releases.
> To prevent this out of order write from happening on those machines,
> we require the process to be pinned to a single cpu. For MPI, this
> is a normal way to operate for other reasons as well.
Oh, this problem affects mspec as well? I didn't realize that. You're
saying that uncached memory writes are the same as PIO writes, as far
as Shub is concerned? If so, then the patch I'm trying to work through
here should solve your problem as well.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (15 preceding siblings ...)
2006-01-24 16:57 ` Brent Casavant
@ 2006-01-24 17:00 ` Robin Holt
2006-01-24 17:33 ` Luck, Tony
` (47 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Robin Holt @ 2006-01-24 17:00 UTC (permalink / raw)
To: linux-ia64
On Tue, Jan 24, 2006 at 10:57:13AM -0600, Brent Casavant wrote:
> On Tue, 24 Jan 2006, Robin Holt wrote:
>
> > The mspec driver does. It has not gotten into the mainline kernel yet,
> > but we continue to build an add-on module for the necessary releases.
> > To prevent this out of order write from happening on those machines,
> > we require the process to be pinned to a single cpu. For MPI, this
> > is a normal way to operate for other reasons as well.
>
> Oh, this problem affects mspec as well? I didn't realize that. You're
> saying that uncached memory writes are the same as PIO writes, as far
> as Shub is concerned? If so, then the patch I'm trying to work through
> here should solve your problem as well.
Sorry, I wrote that before having any morning coffee. mspec driver is
uncached space so ignore my earlier rambling.
Thanks,
Robin
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (16 preceding siblings ...)
2006-01-24 17:00 ` Robin Holt
@ 2006-01-24 17:33 ` Luck, Tony
2006-01-24 18:42 ` Grant Grundler
` (46 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-24 17:33 UTC (permalink / raw)
To: linux-ia64
On Tue, Jan 24, 2006 at 10:02:35AM +0100, Ingo Molnar wrote:
> what's the problem with doing that? It's a single comparison [of two
> values we already have accessed in that function!] and a rarely taken
> branch in a single subarch of a single arch's switch_to() function. If
> this use (of migration related arch functionality) becomes more
> widespread then we can generalize it, but right now i just dont see the
> point, given that the two solutions are almost totally equivalent in
> terms of "overhead".
As coded the comparison wasn't inline, it was inside the machvec function,
so we'd have to make an indirect function call in order to make the
comparison and then do nothing most of the time. I'd be happier with:
if (unlikely(foo != bar)) /* migration */
platform_switch_from(next);
But I'd also like to see how invasive a "task has migrated" bit in
thread_info.flags (as suggested elsewhere in this thread) gets to
be.
-Tony
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (17 preceding siblings ...)
2006-01-24 17:33 ` Luck, Tony
@ 2006-01-24 18:42 ` Grant Grundler
2006-01-24 21:12 ` Brent Casavant
` (45 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Grant Grundler @ 2006-01-24 18:42 UTC (permalink / raw)
To: linux-ia64
On Tue, Jan 24, 2006 at 08:40:55AM -0800, Grant Grundler wrote:
> > Brent, if you find a way to prevent this out-of-order write, please
> > remind me to include it in the mspec driver.
>
> RDMA support will need something more general.
> (I'm asuming mspec is sn2 specific.)
nevermind...I think just made the same mistake as Brent.
RDMA userspace is talking to host memory, not MMIO.
grant
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (18 preceding siblings ...)
2006-01-24 18:42 ` Grant Grundler
@ 2006-01-24 21:12 ` Brent Casavant
2006-01-24 21:41 ` Ingo Molnar
` (44 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 21:12 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Luck, Tony wrote:
> As coded the comparison wasn't inline, it was inside the machvec function,
> so we'd have to make an indirect function call in order to make the
> comparison and then do nothing most of the time. I'd be happier with:
Yeah, sorry about that. Should have occurred to me.
> if (unlikely(foo != bar)) /* migration */
> platform_switch_from(next);
The new patch below implements pretty much exactly that. I'm not
thrilled about carrying around the thread_info.last_cpu field even
on non-SN2, but making the presence/updating of the field conditional
led to huge headaches in the "should we call platform_switch_from"
conditionals/logic.
> But I'd also like to see how invasive a "task has migrated" bit in
> thread_info.flags (as suggested elsewhere in this thread) gets to
> be.
I'll be happy to do this if you really want. That said, having stared
at this code enough now, I don't believe it would be a better solution.
The migrated bit is useful only if we don't track the last CPU the thread
ran on (which we need to do anyway, in order to know which Shub needs
to be drained of PIO writes).
Implementing the migrated bit would require an arch hook in the main
scheduler to set the bit in most places that set_task_cpu() is called.
The bit would then be tested and acted upon in close to the same place
the patch below. The patch below is optimal with respect to how often
it calls platfrom_switch_from(), short of going through the complexity
of adding a "user MMIO mappings" counter to the mm struct and managing
the counter in relevant drivers.
So, does this look better?
Brent
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
index e510dce..a687260 100644
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -654,7 +654,8 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+ pda->pio_write_status_addr = (volatile unsigned long *)
+ GLOBAL_MMR_ADDR(nasid, pio[slice]);
pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
index 471bbaa..e8addd4 100644
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_switch_from - SN-specific migrate-from actions
+ * @task: Task being switched to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_switch_from(struct task_struct *task)
+{
+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
+ unsigned long val = last_pda->pio_write_status_val;
+
+ /* Drain PIO writes from old CPU's Shub */
+ while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
+ cpu_relax();
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm = current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
index ca5ea99..a739a08 100644
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -20,6 +20,7 @@ struct scatterlist;
struct page;
struct mm_struct;
struct pci_bus;
+struct task_struct;
typedef void ia64_mv_setup_t (char **);
typedef void ia64_mv_cpu_init_t (void);
@@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_switch_from_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_switch_from ia64_mv.switch_from
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +202,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_switch_from_t *switch_from;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +247,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_switch_from, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_switch_from
+# define platform_switch_from machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
index e1b6cd6..110dc54 100644
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_switch_from_t sn_switch_from;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_switch_from sn_switch_from
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 80c5a23..da8728d 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -243,6 +243,10 @@ extern void ia64_load_extra (struct task
(prev)->thread.flags |= IA64_THREAD_FPH_VALID; \
__ia64_save_fpu((prev)->thread.fph); \
} \
+ if (unlikely(task_cpu(next) != task_thread_info(next)->last_cpu)) { \
+ platform_switch_from(next); \
+ task_thread_info(next)->last_cpu = task_cpu(next); \
+ } \
__switch_to(prev, next, last); \
} while (0)
#else
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
index 1d6518f..02692b3 100644
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -36,6 +36,7 @@ struct thread_info {
unsigned long start_time;
pid_t pid;
} sigdelayed; /* Saved information for TIF_SIGDELAYED */
+ __u32 last_cpu; /* Last CPU thread ran on */
};
#define THREAD_SIZE KERNEL_STACK_SIZE
@@ -51,6 +52,7 @@ struct thread_info {
.restart_block = { \
.fn = do_no_restart_syscall, \
}, \
+ .last_cpu = 0, \
}
#ifndef ASM_OFFSETS_C
^ permalink raw reply related [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (19 preceding siblings ...)
2006-01-24 21:12 ` Brent Casavant
@ 2006-01-24 21:41 ` Ingo Molnar
2006-01-24 21:43 ` Chen, Kenneth W
` (43 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Ingo Molnar @ 2006-01-24 21:41 UTC (permalink / raw)
To: linux-ia64
* Brent Casavant <bcasavan@sgi.com> wrote:
> +/**
> + * sn_switch_from - SN-specific migrate-from actions
> + * @task: Task being switched to new CPU
> + *
> + * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
> + * Context switching user threads which have memory-mapped MMIO may cause
> + * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
> + * from the previous CPU's Shub before execution resumes on the new CPU.
> + */
> +void sn_switch_from(struct task_struct *task)
> +{
> + pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
> + volatile unsigned long *adr = last_pda->pio_write_status_addr;
> + unsigned long val = last_pda->pio_write_status_val;
> +
> + /* Drain PIO writes from old CPU's Shub */
> + while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
> + cpu_relax();
> +}
btw., note that if such PIO writes happen frequently on a system, you
_really_ want this to be executed as late as possible - to give those
writes a chance to drain while this CPU is doing other useful work. In
that sense, doing the draining in the migration decision code is pretty
much the worst place to do it, because that's the _soonest_ point we
know that the task will migrate. That might still be milliseconds away
from actually hitting the CPU. Even if it was migrated to a totally idle
CPU, there's microseconds of codepath between the migration decision,
and the actual point in time where the switch_to() executes.
a further detail: in that sense i'd rather suggest to move the condition
to after the __switch_to() call. You can drain anytime, as long as it's
done before userspace can issue new PIO writes. So you could as well do
it in the new context.
Ingo
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (20 preceding siblings ...)
2006-01-24 21:41 ` Ingo Molnar
@ 2006-01-24 21:43 ` Chen, Kenneth W
2006-01-24 21:51 ` Luck, Tony
` (42 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-24 21:43 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 1:13 PM
> The new patch below implements pretty much exactly that. I'm not
> thrilled about carrying around the thread_info.last_cpu field even
> on non-SN2, but making the presence/updating of the field conditional
> led to huge headaches in the "should we call platform_switch_from"
> conditionals/logic.
Yeah, I'm not thrilled either. Currently, on context switch kernel
doesn't read thread_info.cpu, nor any other neighboring field. With
the patch, it needs to do a cacheline read and possibly an update.
If unlucky, it might be two cache lines with last_cpu 92 bytes away
from thread_info.cpu. Perhaps, last_cpu should be next to cpu field.
On the other hand, it may not be too bad because on kernel exit, flags
field in thread_info is accessed for TIF_* and if kernel exit path is
taken back-to-back with context switch, the net effect maybe small.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (21 preceding siblings ...)
2006-01-24 21:43 ` Chen, Kenneth W
@ 2006-01-24 21:51 ` Luck, Tony
2006-01-24 22:04 ` Brent Casavant
` (41 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-24 21:51 UTC (permalink / raw)
To: linux-ia64
On Tue, Jan 24, 2006 at 03:12:48PM -0600, Brent Casavant wrote:
> > But I'd also like to see how invasive a "task has migrated" bit in
> > thread_info.flags (as suggested elsewhere in this thread) gets to
> > be.
>
> I'll be happy to do this if you really want. That said, having stared
> at this code enough now, I don't believe it would be a better solution.
You are probably right.
> So, does this look better?
> + .last_cpu = 0, \
How does .last_cpu get propagated to new processes, and what is
it initialized to? Right now it looks like there is an odd
quirk that we'll call platform_switch_from() as each process starts
running for the first time [or is the whole thread_info structure
copied from the parent someplace that I didn't notice].
-Tony
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (22 preceding siblings ...)
2006-01-24 21:51 ` Luck, Tony
@ 2006-01-24 22:04 ` Brent Casavant
2006-01-24 22:07 ` Chen, Kenneth W
` (40 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 22:04 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Ingo Molnar wrote:
> btw., note that if such PIO writes happen frequently on a system, you
> _really_ want this to be executed as late as possible - to give those
> writes a chance to drain while this CPU is doing other useful work. In
> that sense, doing the draining in the migration decision code is pretty
> much the worst place to do it, because that's the _soonest_ point we
> know that the task will migrate. That might still be milliseconds away
> from actually hitting the CPU. Even if it was migrated to a totally idle
> CPU, there's microseconds of codepath between the migration decision,
> and the actual point in time where the switch_to() executes.
Right. Which is of course one of the reasons my original patch sucked.
Since we're now running all the new code at the time of context switch
(and the potentially slow part only when we're *sure* we've switched CPUs),
it should now be about as good as it can get.
> a further detail: in that sense i'd rather suggest to move the condition
> to after the __switch_to() call. You can drain anytime, as long as it's
> done before userspace can issue new PIO writes. So you could as well do
> it in the new context.
Done. The new context is the right place for this to execute anyway
(not that it matters functionally).
Patch following in a few minutes.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (23 preceding siblings ...)
2006-01-24 22:04 ` Brent Casavant
@ 2006-01-24 22:07 ` Chen, Kenneth W
2006-01-24 22:12 ` Brent Casavant
` (39 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-24 22:07 UTC (permalink / raw)
To: linux-ia64
Luck, Tony wrote on Tuesday, January 24, 2006 1:51 PM
> > + .last_cpu = 0, \
>
> How does .last_cpu get propagated to new processes, and what is
> it initialized to? Right now it looks like there is an odd
> quirk that we'll call platform_switch_from() as each process starts
> running for the first time [or is the whole thread_info structure
> copied from the parent someplace that I didn't notice].
I think it is copy from dup_task_struct(): setup_thread_stack(tsk, orig)
and
void setup_thread_stack(struct task_struct *p, struct task_struct *org)
{
*task_thread_info(p) = *task_thread_info(org);
task_thread_info(p)->task = p;
}
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (24 preceding siblings ...)
2006-01-24 22:07 ` Chen, Kenneth W
@ 2006-01-24 22:12 ` Brent Casavant
2006-01-24 22:19 ` Chen, Kenneth W
` (38 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 22:12 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Chen, Kenneth W wrote:
> Yeah, I'm not thrilled either. Currently, on context switch kernel
> doesn't read thread_info.cpu, nor any other neighboring field. With
> the patch, it needs to do a cacheline read and possibly an update.
> If unlucky, it might be two cache lines with last_cpu 92 bytes away
> from thread_info.cpu. Perhaps, last_cpu should be next to cpu field.
Done.
> On the other hand, it may not be too bad because on kernel exit, flags
> field in thread_info is accessed for TIF_* and if kernel exit path is
> taken back-to-back with context switch, the net effect maybe small.
Actually, IA64_HAS_EXTRA_STATE() reads the TIF_* flags, and is called
in __switch_to(), which is in very close proximity to checking last_cpu.
I don't think there'll be a problem here, particularly as I've now moved
last_cpu to neighbor cpu.
Updated patch following in a few minutes.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (25 preceding siblings ...)
2006-01-24 22:12 ` Brent Casavant
@ 2006-01-24 22:19 ` Chen, Kenneth W
2006-01-24 22:31 ` Chen, Kenneth W
` (37 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-24 22:19 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 2:12 PM
> > On the other hand, it may not be too bad because on kernel exit, flags
> > field in thread_info is accessed for TIF_* and if kernel exit path is
> > taken back-to-back with context switch, the net effect maybe small.
>
> Actually, IA64_HAS_EXTRA_STATE() reads the TIF_* flags, and is called
> in __switch_to(), which is in very close proximity to checking last_cpu.
Where??
IA64_HAS_EXTRA_STATE() uses task->thread.flags, not task->thread_info->flags.
These two flags are in two different places.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (26 preceding siblings ...)
2006-01-24 22:19 ` Chen, Kenneth W
@ 2006-01-24 22:31 ` Chen, Kenneth W
2006-01-24 22:41 ` Brent Casavant
` (36 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-24 22:31 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 1:13 PM
> The new patch below implements pretty much exactly that. I'm not
> thrilled about carrying around the thread_info.last_cpu field even
> on non-SN2, but making the presence/updating of the field conditional
> led to huge headaches in the "should we call platform_switch_from"
> conditionals/logic.
Chen, Kenneth wrote on Tuesday, January 24, 2006 1:44 PM
> Yeah, I'm not thrilled either. Currently, on context switch kernel
> doesn't read thread_info.cpu, nor any other neighboring field. With
> the patch, it needs to do a cacheline read and possibly an update.
> If unlucky, it might be two cache lines with last_cpu 92 bytes away
> from thread_info.cpu. Perhaps, last_cpu should be next to cpu field.
> On the other hand, it may not be too bad because on kernel exit, flags
> field in thread_info is accessed for TIF_* and if kernel exit path is
> taken back-to-back with context switch, the net effect maybe small.
Thinking about it, I wonder if Ingo would entertain the idea of moving
p->thread_info.cpu into p->thread.cpu. Ingo?
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (27 preceding siblings ...)
2006-01-24 22:31 ` Chen, Kenneth W
@ 2006-01-24 22:41 ` Brent Casavant
2006-01-24 23:25 ` Chen, Kenneth W
` (35 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 22:41 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Chen, Kenneth W wrote:
> Brent Casavant wrote on Tuesday, January 24, 2006 2:12 PM
> > > On the other hand, it may not be too bad because on kernel exit, flags
> > > field in thread_info is accessed for TIF_* and if kernel exit path is
> > > taken back-to-back with context switch, the net effect maybe small.
> >
> > Actually, IA64_HAS_EXTRA_STATE() reads the TIF_* flags, and is called
> > in __switch_to(), which is in very close proximity to checking last_cpu.
>
> Where??
>
> IA64_HAS_EXTRA_STATE() uses task->thread.flags, not task->thread_info->flags.
> These two flags are in two different places.
Bhaa! You are right of course. My mistake.
Still, other than moving last_cpu to neighbor cpu (or getting them
moved to the thread_struct, as you inquired of Ingo), I'm not sure
what can be done to further mitigate the cacheline impact.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (28 preceding siblings ...)
2006-01-24 22:41 ` Brent Casavant
@ 2006-01-24 23:25 ` Chen, Kenneth W
2006-01-24 23:28 ` Brent Casavant
` (34 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-24 23:25 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Friday, January 20, 2006 12:01 PM
> Regarding the direction Ingo sent me down, and considering what Jack
> said about needing a hook for a future platform, I'm thinking of grabbing
> a bit in task->thread.flags that IA64_HAS_EXTRA_STATE() could detect and
> let ia64_{save,load}_extra() call new machvecs to perform this
> chipset-specific context management. It's a bit overengineered for
> my particular case, but would allow Jack to plug in his work very
> cleanly.
I wonder why you did not continue on this path. I think the best you
can do is to grab a bit in task->thread.flags, at boot time dynamically
set it to 1 for init_task on sn2 platform and 0 for others, invent a
name something like IA64_THREAD_MMIO_SYNC. Then in switch_to(), you do:
If (t->thread.flags & IA64_THREAD_MMIO_SYNC)
platform_switch_from(...)
On non-sgi machine, the test will always false, which address most of
the performance concerns from various people (including me). And on
sn2 platform, it will handle the synchronization in sn2 platform specific
vector.
You can go even one step further that only set the bit for process that
has the MMIO address mapped. So on sn2, you won't pay the extra cost if
it is not required for most of the processes. Something workable?
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (29 preceding siblings ...)
2006-01-24 23:25 ` Chen, Kenneth W
@ 2006-01-24 23:28 ` Brent Casavant
2006-01-24 23:36 ` Chen, Kenneth W
` (33 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 23:28 UTC (permalink / raw)
To: linux-ia64
Take 4. Moved "really migrated?" check as late as possible in
__switch_task(), and moved thread_info last_cpu field closer to
flags and cpu fields. Some nomenclature changes as well.
---
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted (the
non-SN2 case is a no-op). This is placed in the context switch-in
path, and only performs the wait when the newly scheduled task changes
CPUs.
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
arch/ia64/sn/kernel/setup.c | 5 +++--
arch/ia64/sn/kernel/sn2/sn2_smp.c | 22 +++++++++++++++++++++-
include/asm-ia64/machvec.h | 13 +++++++++++++
include/asm-ia64/machvec_sn2.h | 4 +++-
include/asm-ia64/system.h | 11 +++++++++++
include/asm-ia64/thread_info.h | 2 ++
6 files changed, 53 insertions(+), 4 deletions(-)
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
index e510dce..a687260 100644
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -654,7 +654,8 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+ pda->pio_write_status_addr = (volatile unsigned long *)
+ GLOBAL_MMR_ADDR(nasid, pio[slice]);
pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
index 471bbaa..2b59dff 100644
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_migrate - SN-specific task migration actions
+ * @task: Task being migrated to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_migrate(struct task_struct *task)
+{
+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
+ unsigned long val = last_pda->pio_write_status_val;
+
+ /* Drain PIO writes from old CPU's Shub */
+ while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
+ cpu_relax();
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm = current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
index ca5ea99..c3e4ed8 100644
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -20,6 +20,7 @@ struct scatterlist;
struct page;
struct mm_struct;
struct pci_bus;
+struct task_struct;
typedef void ia64_mv_setup_t (char **);
typedef void ia64_mv_cpu_init_t (void);
@@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_migrate_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_migrate ia64_mv.migrate
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +202,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_migrate_t *migrate;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +247,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_migrate, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_migrate
+# define platform_migrate machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
index e1b6cd6..6f0021b 100644
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_migrate_t sn_migrate;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_migrate sn_migrate
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 80c5a23..fe90ae2 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -221,12 +221,23 @@ extern void ia64_load_extra (struct task
((t)->thread.flags & (IA64_THREAD_DBG_VALID|IA64_THREAD_PM_VALID) \
|| IS_IA32_PROCESS(task_pt_regs(t)) || PERFMON_IS_SYSWIDE())
+#ifdef CONFIG_SMP
+# define arch_migrate(task) \
+ if (task_cpu(task) != task_thread_info(task)->last_cpu) { \
+ platform_migrate(task); \
+ task_thread_info(task)->last_cpu = task_cpu(task); \
+ }
+#else
+# define arch_migrate(task) do { } while(0)
+#endif
+
#define __switch_to(prev,next,last) do { \
if (IA64_HAS_EXTRA_STATE(prev)) \
ia64_save_extra(prev); \
if (IA64_HAS_EXTRA_STATE(next)) \
ia64_load_extra(next); \
ia64_psr(task_pt_regs(next))->dfh = !ia64_is_local_fpu_owner(next); \
+ arch_migrate(next); \
(last) = ia64_switch_to((next)); \
} while (0)
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
index 1d6518f..8aec977 100644
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -26,6 +26,7 @@ struct thread_info {
struct exec_domain *exec_domain;/* execution domain */
__u32 flags; /* thread_info flags (see TIF_*) */
__u32 cpu; /* current CPU */
+ __u32 last_cpu; /* Last CPU thread ran on */
mm_segment_t addr_limit; /* user-level address space limit */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
struct restart_block restart_block;
@@ -46,6 +47,7 @@ struct thread_info {
.exec_domain = &default_exec_domain, \
.flags = 0, \
.cpu = 0, \
+ .last_cpu = 0, \
.addr_limit = KERNEL_DS, \
.preempt_count = 0, \
.restart_block = { \
^ permalink raw reply related [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (30 preceding siblings ...)
2006-01-24 23:28 ` Brent Casavant
@ 2006-01-24 23:36 ` Chen, Kenneth W
2006-01-24 23:54 ` Brent Casavant
` (32 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-24 23:36 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 3:29 PM
> Take 4. Moved "really migrated?" check as late as possible in
> __switch_task(), and moved thread_info last_cpu field closer to
> flags and cpu fields. Some nomenclature changes as well.
I think you should seriously consider my other proposal :-)
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (31 preceding siblings ...)
2006-01-24 23:36 ` Chen, Kenneth W
@ 2006-01-24 23:54 ` Brent Casavant
2006-01-25 0:10 ` Brent Casavant
` (31 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-24 23:54 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Chen, Kenneth W wrote:
> Brent Casavant wrote on Friday, January 20, 2006 12:01 PM
> > Regarding the direction Ingo sent me down, and considering what Jack
> > said about needing a hook for a future platform, I'm thinking of grabbing
> > a bit in task->thread.flags that IA64_HAS_EXTRA_STATE() could detect and
> > let ia64_{save,load}_extra() call new machvecs to perform this
> > chipset-specific context management. It's a bit overengineered for
> > my particular case, but would allow Jack to plug in his work very
> > cleanly.
>
> I wonder why you did not continue on this path.
Two main reasons:
1. Requires an arch hook in the main scheduler to set the flag in most
places set_task_cpu() is called. Ingo didn't seem fond of that
(see my original patch and his response).
2. The initialization code gets kind of ugly.
I suppose I could solve the second problem with a machvec that is only
called once, during the contruction of the init task. So really, the
first reason is the bulk of it.
> You can go even one step further that only set the bit for process that
> has the MMIO address mapped. So on sn2, you won't pay the extra cost if
> it is not required for most of the processes. Something workable?
Workable, but requires additions to the mm struct and modifications to
several device drivers. Since migration is relatively infrequent to
begin with, I can't imagine this is worth doing in the abscence of an
observed problem. But I'll keep it in mind for future optimization.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (32 preceding siblings ...)
2006-01-24 23:54 ` Brent Casavant
@ 2006-01-25 0:10 ` Brent Casavant
2006-01-25 0:29 ` Chen, Kenneth W
` (30 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 0:10 UTC (permalink / raw)
To: linux-ia64
On Tue, 24 Jan 2006, Chen, Kenneth W wrote:
> Brent Casavant wrote on Tuesday, January 24, 2006 3:29 PM
> > Take 4. Moved "really migrated?" check as late as possible in
> > __switch_task(), and moved thread_info last_cpu field closer to
> > flags and cpu fields. Some nomenclature changes as well.
>
> I think you should seriously consider my other proposal :-)
Well... if Ingo is amenable to the arch hook in the scheduler, I'm
OK with changing directions. Again. :-)
The arch hook would be a lot simpler this time -- simply setting
a flag instead of performing the PIO write drain. Not sure if
that's enough to argue in favor of the hook or not.
Truth be told, I have no strong preference either way. I'd just
like to break out of this Brownian motion. :-)
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (33 preceding siblings ...)
2006-01-25 0:10 ` Brent Casavant
@ 2006-01-25 0:29 ` Chen, Kenneth W
2006-01-25 6:27 ` Keith Owens
` (29 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-25 0:29 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 3:55 PM
> > > Regarding the direction Ingo sent me down, and considering what Jack
> > > said about needing a hook for a future platform, I'm thinking of grabbing
> > > a bit in task->thread.flags that IA64_HAS_EXTRA_STATE() could detect and
> > > let ia64_{save,load}_extra() call new machvecs to perform this
> > > chipset-specific context management. It's a bit overengineered for
> > > my particular case, but would allow Jack to plug in his work very
> > > cleanly.
> >
> > I wonder why you did not continue on this path.
>
> Two main reasons:
>
> 1. Requires an arch hook in the main scheduler to set the flag in most
> places set_task_cpu() is called. Ingo didn't seem fond of that
> (see my original patch and his response).
> 2. The initialization code gets kind of ugly.
>
Well, maybe bad wording, I didn't meant to pursue scheduler hook in the
generic code. Instead, idea is morphed into having a thread specific
bit to tell kernel whether it needs to do MMIO synchronization for that
task at context switch. The beauty of it is that for task that doesn't
need to do any of MMIO synchronization, the cost is kept at minimal even
if task migrated.
In the event of task migration, your "take 4" patch will cause a cache
line update, another cache line read for the function pointer, followed
by an empty function call, on *all* non-sn2 platforms. Even on sn2
platform, for process that *doesn't* do MMIO, you are still paying the
cost of accessing shub in sn_migrate(), which can be optimized away. I
hope you understand what I'm talking about. Otherwise, I can do a patch
on top of yours.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (34 preceding siblings ...)
2006-01-25 0:29 ` Chen, Kenneth W
@ 2006-01-25 6:27 ` Keith Owens
2006-01-25 9:04 ` Chen, Kenneth W
` (28 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Keith Owens @ 2006-01-25 6:27 UTC (permalink / raw)
To: linux-ia64
Brent Casavant (on Tue, 24 Jan 2006 15:12:48 -0600 (CST)) wrote:
>+void sn_switch_from(struct task_struct *task)
>+{
>+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
>+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
>+ unsigned long val = last_pda->pio_write_status_val;
>+
>+ /* Drain PIO writes from old CPU's Shub */
>+ while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
>+ cpu_relax();
>+}
cpu_relax() maps to ia64_hint(ia64_hint_pause) which is defined as a
memory clobber, so you do not need to mark adr as volatile. Linus
recommends against relying on volatile in C code, memory barrier
operations like cpu_relax should be used instead.
Alas include/asm-ia64/intel_intrin.h defines ia64_hint() as a no-op.
That needs to be fixed.
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (35 preceding siblings ...)
2006-01-25 6:27 ` Keith Owens
@ 2006-01-25 9:04 ` Chen, Kenneth W
2006-01-25 9:24 ` Chen, Kenneth W
` (27 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-25 9:04 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 4:11 PM
> > Brent Casavant wrote on Tuesday, January 24, 2006 3:29 PM
> > > Take 4. Moved "really migrated?" check as late as possible in
> > > __switch_task(), and moved thread_info last_cpu field closer to
> > > flags and cpu fields. Some nomenclature changes as well.
> >
> > I think you should seriously consider my other proposal :-)
>
> Well... if Ingo is amenable to the arch hook in the scheduler, I'm
> OK with changing directions. Again. :-)
Hmm, I'm not proposing for arch hook at all.
> The arch hook would be a lot simpler this time -- simply setting
> a flag instead of performing the PIO write drain. Not sure if
> that's enough to argue in favor of the hook or not.
>
> Truth be told, I have no strong preference either way. I'd just
> like to break out of this Brownian motion. :-)
This is the "what the heck is he talking about?" patch, on top of
"take 4" patch:
---
Optimize platform_migrate(): call platform specific vector only
if machine requires it.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
--- linus-2.6.git/include/asm-ia64/processor.h.orig 2006-01-21 10:58:06.000000000 -0800
+++ linus-2.6.git/include/asm-ia64/processor.h 2006-01-25 01:09:34.169421315 -0800
@@ -50,7 +50,7 @@
#define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
#define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
#define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
- /* bit 5 is currently unused */
+#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration sync at ctx sw */
#define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
#define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
--- linus-2.6.git/include/asm-ia64/system.h.orig 2006-01-25 01:14:07.661605464 -0800
+++ linus-2.6.git/include/asm-ia64/system.h 2006-01-25 01:17:36.173321660 -0800
@@ -221,23 +221,12 @@ extern void ia64_load_extra (struct task
((t)->thread.flags & (IA64_THREAD_DBG_VALID|IA64_THREAD_PM_VALID) \
|| IS_IA32_PROCESS(task_pt_regs(t)) || PERFMON_IS_SYSWIDE())
-#ifdef CONFIG_SMP
-# define arch_migrate(task) \
- if (task_cpu(task) != task_thread_info(task)->last_cpu) { \
- platform_migrate(task); \
- task_thread_info(task)->last_cpu = task_cpu(task); \
- }
-#else
-# define arch_migrate(task) do { } while(0)
-#endif
-
#define __switch_to(prev,next,last) do { \
if (IA64_HAS_EXTRA_STATE(prev)) \
ia64_save_extra(prev); \
if (IA64_HAS_EXTRA_STATE(next)) \
ia64_load_extra(next); \
ia64_psr(task_pt_regs(next))->dfh = !ia64_is_local_fpu_owner(next); \
- arch_migrate(next); \
(last) = ia64_switch_to((next)); \
} while (0)
@@ -255,6 +244,8 @@ extern void ia64_load_extra (struct task
__ia64_save_fpu((prev)->thread.fph); \
} \
__switch_to(prev, next, last); \
+ if (unlikely((next)->thread.flags & IA64_THREAD_MIGRATION)) \
+ platform_migrate(next); \
} while (0)
#else
# define switch_to(prev,next,last) __switch_to(prev, next, last)
--- linus-2.6.git/arch/ia64/sn/kernel/sn2/sn2_smp.c.orig 2006-01-25 01:13:56.322738416 -0800
+++ linus-2.6.git/arch/ia64/sn/kernel/sn2/sn2_smp.c 2006-01-25 01:19:27.169414050 -0800
@@ -184,9 +184,13 @@ void sn_migrate(struct task_struct *task
volatile unsigned long *adr = last_pda->pio_write_status_addr;
unsigned long val = last_pda->pio_write_status_val;
+ if (task_cpu(task) = task_thread_info(task)->last_cpu)
+ return;
+
/* Drain PIO writes from old CPU's Shub */
while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
cpu_relax();
+ task_thread_info(task)->last_cpu = task_cpu(task);
}
void sn_tlb_migrate_finish(struct mm_struct *mm)
--- linus-2.6.git/arch/ia64/sn/kernel/setup.c.orig 2006-01-25 01:02:30.830559313 -0800
+++ linus-2.6.git/arch/ia64/sn/kernel/setup.c 2006-01-25 01:09:57.794421025 -0800
@@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
* for sn.
*/
pm_power_off = ia64_sn_power_down;
+ current->thread.flags |= IA64_THREAD_MIGRATION;
}
/**
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (36 preceding siblings ...)
2006-01-25 9:04 ` Chen, Kenneth W
@ 2006-01-25 9:24 ` Chen, Kenneth W
2006-01-25 17:04 ` Brent Casavant
` (26 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-25 9:24 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Tuesday, January 24, 2006 3:29 PM
> Take 4. Moved "really migrated?" check as late as possible in
> __switch_task(), and moved thread_info last_cpu field closer to
> flags and cpu fields. Some nomenclature changes as well.
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
> #define __switch_to(prev,next,last) do { \
> if (IA64_HAS_EXTRA_STATE(prev)) \
> ia64_save_extra(prev); \
> if (IA64_HAS_EXTRA_STATE(next)) \
> ia64_load_extra(next); \
> ia64_psr(task_pt_regs(next))->dfh = !ia64_is_local_fpu_owner(next); \
> + arch_migrate(next); \
> (last) = ia64_switch_to((next)); \
> } while (0)
Thought everyone agreed to call the sync function with the new task context,
(after ia64_switch_to). Something changed which made you changed your mind?
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
> @@ -46,6 +47,7 @@ struct thread_info {
> .exec_domain = &default_exec_domain, \
> .flags = 0, \
> .cpu = 0, \
> + .last_cpu = 0, \
> .addr_limit = KERNEL_DS, \
> .preempt_count = 0, \
> .restart_block = { \
Looked spurious, init_task has static initializer, which will have zero
initial value automatically.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (37 preceding siblings ...)
2006-01-25 9:24 ` Chen, Kenneth W
@ 2006-01-25 17:04 ` Brent Casavant
2006-01-25 17:45 ` Brent Casavant
` (25 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 17:04 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Chen, Kenneth W wrote:
> This is the "what the heck is he talking about?" patch, on top of
> "take 4" patch:
Thank you very much. I get it now.
What I thought you meant was flipping a bit in locations such as
migrate_task() to indicate that this task has migrated since it
last ran. That obviously requird a scheduler hook, and where
I was running into a misunderstanding. Thanks for clarifying it
with code.
A few questions:
> --- linus-2.6.git/include/asm-ia64/system.h.orig 2006-01-25 01:14:07.661605464 -0800
> +++ linus-2.6.git/include/asm-ia64/system.h 2006-01-25 01:17:36.173321660 -0800
> @@ -255,6 +244,8 @@ extern void ia64_load_extra (struct task
> __ia64_save_fpu((prev)->thread.fph); \
> } \
> __switch_to(prev, next, last); \
> + if (unlikely((next)->thread.flags & IA64_THREAD_MIGRATION)) \
> + platform_migrate(next); \
> } while (0)
> #else
> # define switch_to(prev,next,last) __switch_to(prev, next, last)
Would it be better to add this to the IA64_HAS_EXTRA_STATE() check,
and move the call to platform_migrate() into ia64_load_extra()? Or
am I failing to take something into consideration?
Also, yesterday when I moved the platform_migrate() call after
__switch_task() (actually, after ia64_switch_to()) I would receive
kernel panics during boot (the migration threads would die from an
invalid access, swapper shortly thereafter, and finally a "soft lockup"
on swapper). Was I perhaps missing something?
Thanks,
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (38 preceding siblings ...)
2006-01-25 17:04 ` Brent Casavant
@ 2006-01-25 17:45 ` Brent Casavant
2006-01-25 17:48 ` Brent Casavant
` (24 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 17:45 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Chen, Kenneth W wrote:
> Thought everyone agreed to call the sync function with the new task context,
> (after ia64_switch_to). Something changed which made you changed your mind?
Yeah. I mentioned this in my other message, but I kept running into
invalid accesses (and migration/swapper threads dying) in sn_migrate()
when the sync was done after ia64_switch_to(). Any idea what I might
be doing wrong? Perhaps I need to call barrier() first?
> diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
> > @@ -46,6 +47,7 @@ struct thread_info {
> > .exec_domain = &default_exec_domain, \
> > .flags = 0, \
> > .cpu = 0, \
> > + .last_cpu = 0, \
> > .addr_limit = KERNEL_DS, \
> > .preempt_count = 0, \
> > .restart_block = { \
>
> Looked spurious, init_task has static initializer, which will have zero
> initial value automatically.
I was just following the existing mechanism in the code (obviously
I'm not the only one to add "= 0" in here.
I can submit a separate cleanup patch for this and quite a bit of other
code if you'd like. I've run into all sorts of 80-column/etc issues
while working on this, but didn't want to mix cleanup with meatier
changes.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (39 preceding siblings ...)
2006-01-25 17:45 ` Brent Casavant
@ 2006-01-25 17:48 ` Brent Casavant
2006-01-25 19:01 ` Chen, Kenneth W
` (23 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 17:48 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Keith Owens wrote:
> Brent Casavant (on Tue, 24 Jan 2006 15:12:48 -0600 (CST)) wrote:
> >+void sn_switch_from(struct task_struct *task)
> >+{
> >+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
> >+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
> >+ unsigned long val = last_pda->pio_write_status_val;
> >+
> >+ /* Drain PIO writes from old CPU's Shub */
> >+ while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
> >+ cpu_relax();
> >+}
>
> cpu_relax() maps to ia64_hint(ia64_hint_pause) which is defined as a
> memory clobber, so you do not need to mark adr as volatile. Linus
> recommends against relying on volatile in C code, memory barrier
> operations like cpu_relax should be used instead.
Hmm. Does that mean that the SN mmiowb() implementation is incorrect
as well? Because this was pretty much a cut-n-paste job from there.
I can probably fix that (mmiowb()) in the near future, as there's
some upcoming work needed in that area.
> Alas include/asm-ia64/intel_intrin.h defines ia64_hint() as a no-op.
Err, so we should hold off on fixing this plus mmiowb() until the
intrinsic is fixed, correct?
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (40 preceding siblings ...)
2006-01-25 17:48 ` Brent Casavant
@ 2006-01-25 19:01 ` Chen, Kenneth W
2006-01-25 19:15 ` Brent Casavant
` (22 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-25 19:01 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Wednesday, January 25, 2006 9:04 AM
> A few questions:
>
> > --- linus-2.6.git/include/asm-ia64/system.h.orig 2006-01-25
01:14:07.661605464 -0800
> > +++ linus-2.6.git/include/asm-ia64/system.h 2006-01-25
01:17:36.173321660 -0800
> > @@ -255,6 +244,8 @@ extern void ia64_load_extra (struct task
> > __ia64_save_fpu((prev)->thread.fph);
\
> > }
\
> > __switch_to(prev, next, last);
\
> > + if (unlikely((next)->thread.flags & IA64_THREAD_MIGRATION))
\
> > + platform_migrate(next);
\
> > } while (0)
> > #else
> > # define switch_to(prev,next,last) __switch_to(prev, next, last)
>
> Would it be better to add this to the IA64_HAS_EXTRA_STATE() check,
> and move the call to platform_migrate() into ia64_load_extra()? Or
> am I failing to take something into consideration?
I suppose, I don't have strong opinion where it is being called.
Adding to ia64_load_extra(), you might need #ifdef CONFIG_SMP, of
course you can just sneak in a plain version unless people yell at
you for slowing their UP performance (I won't yell at you for UP).
> Also, yesterday when I moved the platform_migrate() call after
> __switch_task() (actually, after ia64_switch_to()) I would receive
> kernel panics during boot (the migration threads would die from an
> invalid access, swapper shortly thereafter, and finally a "soft
lockup"
> on swapper). Was I perhaps missing something?
I can't immediately see why it won't work on sn2. It works for me
on a Intel tiger ia64 machine.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (41 preceding siblings ...)
2006-01-25 19:01 ` Chen, Kenneth W
@ 2006-01-25 19:15 ` Brent Casavant
2006-01-25 19:43 ` Jack Steiner
` (21 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 19:15 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Chen, Kenneth W wrote:
> I suppose, I don't have strong opinion where it is being called.
> Adding to ia64_load_extra(), you might need #ifdef CONFIG_SMP, of
> course you can just sneak in a plain version unless people yell at
> you for slowing their UP performance (I won't yell at you for UP).
Hmm. I just thought of a good reason. Adding it to IA64_HAS_EXTRA_STATE
will cause ia64_{load,save}_extra to be invoked with every single
context switch on SN2, which I don't think we want to do. Better to
leave it a separate test. We may want to revisit that in the future
if/when IA64_THREAD_MIGRATION indicates something finer-grained than
"Is/isn't SN2", but for now it's the better solution.
> > __switch_task() (actually, after ia64_switch_to()) I would receive
> > kernel panics during boot (the migration threads would die from an
> > invalid access, swapper shortly thereafter, and finally a "soft
> lockup"
> > on swapper). Was I perhaps missing something?
>
> I can't immediately see why it won't work on sn2. It works for me
> on a Intel tiger ia64 machine.
OK, I'll dig deeper into it. I suspect ia64_switch_to() is changing
something important underneath us (e.g. a value saved on the stack,
which would tend to change when we switch stacks).
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (42 preceding siblings ...)
2006-01-25 19:15 ` Brent Casavant
@ 2006-01-25 19:43 ` Jack Steiner
2006-01-25 22:49 ` Brent Casavant
` (20 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Jack Steiner @ 2006-01-25 19:43 UTC (permalink / raw)
To: linux-ia64
On Wed, Jan 25, 2006 at 01:15:49PM -0600, Brent Casavant wrote:
> On Wed, 25 Jan 2006, Chen, Kenneth W wrote:
>
> > I suppose, I don't have strong opinion where it is being called.
> > Adding to ia64_load_extra(), you might need #ifdef CONFIG_SMP, of
> > course you can just sneak in a plain version unless people yell at
> > you for slowing their UP performance (I won't yell at you for UP).
When the dust settles on this patch, I'd like to investigate whether the TLB
flushing that is done on SN platforms via the scheduler migration callout in
set_cpus_allowed() (see tlb_migrate_finish()) could be done using your
new mechanism. Your new mechanism looks better suited to do what is required
& will result in better performance.
Note that I'm NOT suggesting that tlb_migrate_finish() can be
used for what you need - it is called only on explicit user migrations
via set_cpus_allowed().
--
Jack
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (43 preceding siblings ...)
2006-01-25 19:43 ` Jack Steiner
@ 2006-01-25 22:49 ` Brent Casavant
2006-01-25 23:09 ` Brent Casavant
` (19 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 22:49 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Chen, Kenneth W wrote:
> Brent Casavant wrote on Wednesday, January 25, 2006 9:04 AM
>
> > Also, yesterday when I moved the platform_migrate() call after
> > __switch_task() (actually, after ia64_switch_to()) I would receive
> > kernel panics during boot (the migration threads would die from an
> > invalid access, swapper shortly thereafter, and finally a "soft
> lockup"
> > on swapper). Was I perhaps missing something?
>
> I can't immediately see why it won't work on sn2. It works for me
> on a Intel tiger ia64 machine.
Hmm. I have only scant evidence to go on from the panic output,
but it looks like "current" (r13) is not equal to the value of "next"
by the time we get into sn_migrate(). One example run apparently
has currentà000030079d8000, nextà00003015990000.
While this shouldn't cause a problem as long as "next" is valid,
it is unexpected and makes me wonder if my understanding of the
code is correct. Unfortunately using SGI's simulator sheds no
light on the situation, as the code doesn't fail there.
Also loading cpu and last_cpu from next->thread_info for the migration/1
thread, I see ASCII data in the lower 32 bits. So either the data in
next->thread_info is invalid, or "next" itself is invalid. This gets
us into trouble when pdacpu() is called (or maybe when we dereference
the result).
Still staring at it...
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (44 preceding siblings ...)
2006-01-25 22:49 ` Brent Casavant
@ 2006-01-25 23:09 ` Brent Casavant
2006-01-25 23:49 ` Brent Casavant
` (18 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 23:09 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Brent Casavant wrote:
> Hmm. I have only scant evidence to go on from the panic output,
> but it looks like "current" (r13) is not equal to the value of "next"
> by the time we get into sn_migrate(). One example run apparently
> has currentà000030079d8000, nextà00003015990000.
Got it! We're context switching. "next" became "current". No
telling what on earth "next's next" was. So after calling
ia64_switch_to(next), I have to refer to "current" in order to
access what was "next" before the call. My brain's all knotted up.
Take 5 following soon.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (45 preceding siblings ...)
2006-01-25 23:09 ` Brent Casavant
@ 2006-01-25 23:49 ` Brent Casavant
2006-01-25 23:56 ` Chen, Kenneth W
` (17 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-25 23:49 UTC (permalink / raw)
To: linux-ia64
Take 5. Incorporated Ken's patch for a thread flag, accomodated
"next" changing identity after context switch, removed volatile
specifier from sn_migrate() "adr" variable, and changed/moved a
few bits of code in/between sn_migrate() and switch_to() for better
performance.
---
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted (the
non-SN2 case is a no-op). This is placed in the context switch-in
path, and only performs the wait when the newly scheduled task changes
CPUs.
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
arch/ia64/sn/kernel/setup.c | 6 ++++--
arch/ia64/sn/kernel/sn2/sn2_smp.c | 22 +++++++++++++++++++++-
include/asm-ia64/machvec.h | 13 +++++++++++++
include/asm-ia64/machvec_sn2.h | 4 +++-
include/asm-ia64/processor.h | 2 +-
include/asm-ia64/system.h | 6 ++++++
include/asm-ia64/thread_info.h | 1 +
7 files changed, 49 insertions(+), 5 deletions(-)
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
index e510dce..06f08b3 100644
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
* for sn.
*/
pm_power_off = ia64_sn_power_down;
+ current->thread.flags |= IA64_THREAD_MIGRATION;
}
/**
@@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+ pda->pio_write_status_addr = (volatile unsigned long *)
+ GLOBAL_MMR_ADDR(nasid, pio[slice]);
pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
index 471bbaa..87f92b1 100644
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_migrate - SN-specific task migration actions
+ * @task: Task being migrated to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_migrate(struct task_struct *task)
+{
+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
+ unsigned long *adr = last_pda->pio_write_status_addr;;
+ unsigned long val = last_pda->pio_write_status_val;
+
+ /* Drain PIO writes from old CPU's Shub */
+ while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val))
+ cpu_relax();
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm = current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
index ca5ea99..c3e4ed8 100644
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -20,6 +20,7 @@ struct scatterlist;
struct page;
struct mm_struct;
struct pci_bus;
+struct task_struct;
typedef void ia64_mv_setup_t (char **);
typedef void ia64_mv_cpu_init_t (void);
@@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_migrate_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_migrate ia64_mv.migrate
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +202,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_migrate_t *migrate;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +247,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_migrate, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_migrate
+# define platform_migrate machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
index e1b6cd6..6f0021b 100644
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_migrate_t sn_migrate;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_migrate sn_migrate
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
index 09b9902..8eabfec 100644
--- a/include/asm-ia64/processor.h
+++ b/include/asm-ia64/processor.h
@@ -50,7 +50,7 @@
#define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
#define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
#define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
- /* bit 5 is currently unused */
+#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration sync at ctx sw */
#define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
#define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 80c5a23..f382dea 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -244,6 +244,12 @@ extern void ia64_load_extra (struct task
__ia64_save_fpu((prev)->thread.fph); \
} \
__switch_to(prev, next, last); \
+ /* "next" in old context is "current" in new context */ \
+ if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
+ (task_cpu(current) != task_thread_info(current)->last_cpu))) { \
+ platform_migrate(current); \
+ task_thread_info(current)->last_cpu = task_cpu(current); \
+ } \
} while (0)
#else
# define switch_to(prev,next,last) __switch_to(prev, next, last)
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
index 1d6518f..81641a6 100644
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -26,6 +26,7 @@ struct thread_info {
struct exec_domain *exec_domain;/* execution domain */
__u32 flags; /* thread_info flags (see TIF_*) */
__u32 cpu; /* current CPU */
+ __u32 last_cpu; /* Last CPU thread ran on */
mm_segment_t addr_limit; /* user-level address space limit */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
struct restart_block restart_block;
^ permalink raw reply related [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (46 preceding siblings ...)
2006-01-25 23:49 ` Brent Casavant
@ 2006-01-25 23:56 ` Chen, Kenneth W
2006-01-26 1:06 ` Luck, Tony
` (16 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-25 23:56 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote on Wednesday, January 25, 2006 3:10 PM
> On Wed, 25 Jan 2006, Brent Casavant wrote:
>
> > Hmm. I have only scant evidence to go on from the panic output,
> > but it looks like "current" (r13) is not equal to the value of "next"
> > by the time we get into sn_migrate(). One example run apparently
> > has currentà000030079d8000, nextà00003015990000.
>
> Got it! We're context switching. "next" became "current". No
> telling what on earth "next's next" was. So after calling
> ia64_switch_to(next), I have to refer to "current" in order to
> access what was "next" before the call. My brain's all knotted up.
Oh my gosh, what a nasty bug !! Thanks for finding it. "next's next"
in the "next" context is the task that "next" was previously context
switched to. Enough mind-boggling wording :-(
p1 switch to p2, then switch to p3 and p3 switched to p1, after
ia64_switch_to, p1 next is pointing to p2. The kernel would die and
fall flat on its face in no time. I guess I'm overly lucky by not
deriving anything pointer value read from thread_info->cpu/last_cpu.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (47 preceding siblings ...)
2006-01-25 23:56 ` Chen, Kenneth W
@ 2006-01-26 1:06 ` Luck, Tony
2006-01-26 1:31 ` Prarit Bhargava
` (15 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-26 1:06 UTC (permalink / raw)
To: linux-ia64
Getting real close. Now I'm down to nitpicking the format
of the e-mail :-)
Andrew Morton's perfect patch document:
http://www.zipworld.com.au/~akpm/linux/patches/stuff/tpp.txt
says that the changelog comes first, and the random commentary
comes after the "---" line (see paragraph "g"). You have this
backwards, so git attempted to use the wrong bit of text as the
changelog.
> Signed-off-by: Brent Casavant <bcasavan@sgi.com>
> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
> Signed-off-by: Brent Casavant <bcasavan@sgi.com>
Perhaps a little presumptious to assume that Ken is ok with
the combined version of his part of this patch (ok, I checked
with Ken, and as it happens he is ok with it). One signoff from
you would be sufficient.
And now an actual (albeit trivial) problem. I see a couple of
warnings when building sn2/generic kernels from the new code
that you added:
arch/ia64/sn/kernel/sn2/sn2_smp.c:184: warning: initialization discards qualifiers from pointer target type
arch/ia64/sn/kernel/sn2/sn2_smp.c:185: warning: ISO C90 forbids mixed declarations and code
(compiling with gcc 3.4.3)
-Tony
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (48 preceding siblings ...)
2006-01-26 1:06 ` Luck, Tony
@ 2006-01-26 1:31 ` Prarit Bhargava
2006-01-26 2:43 ` Keith Owens
` (14 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Prarit Bhargava @ 2006-01-26 1:31 UTC (permalink / raw)
To: linux-ia64
Hi Brent,
Just a quick CodingStyle review.
P.
> diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
> index e510dce..06f08b3 100644
> --- a/arch/ia64/sn/kernel/setup.c
> +++ b/arch/ia64/sn/kernel/setup.c
> @@ -3,7 +3,7 @@
> * License. See the file "COPYING" in the main directory of this archive
> * for more details.
> *
> - * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
> + * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
> */
>
> #include <linux/config.h>
> @@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
> * for sn.
> */
> pm_power_off = ia64_sn_power_down;
> + current->thread.flags |= IA64_THREAD_MIGRATION;
> }
>
> /**
> @@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
> SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
> u64 *pio;
> pio = is_shub1() ? pio1 : pio2;
> - pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
> + pda->pio_write_status_addr = (volatile unsigned long *)
> + GLOBAL_MMR_ADDR(nasid, pio[slice]);
alignment
> pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
> }
>
> diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
> index 471bbaa..87f92b1 100644
> --- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
> +++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
> @@ -5,7 +5,7 @@
> * License. See the file "COPYING" in the main directory of this archive
> * for more details.
> *
> - * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
> + * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
> */
>
> #include <linux/init.h>
> @@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
> return ws;
> }
>
> +/**
> + * sn_migrate - SN-specific task migration actions
> + * @task: Task being migrated to new CPU
> + *
> + * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
> + * Context switching user threads which have memory-mapped MMIO may cause
> + * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
> + * from the previous CPU's Shub before execution resumes on the new CPU.
> + */
> +void sn_migrate(struct task_struct *task)
> +{
> + pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
> + unsigned long *adr = last_pda->pio_write_status_addr;;
extra semi-colon -- leads to Tony's compile error.
> + unsigned long val = last_pda->pio_write_status_val;
> +
> + /* Drain PIO writes from old CPU's Shub */
> + while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val))
80 columns
> + cpu_relax();
> +}
> +
> void sn_tlb_migrate_finish(struct mm_struct *mm)
> {
> if (mm = current->mm)
> diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
> index ca5ea99..c3e4ed8 100644
> --- a/include/asm-ia64/machvec.h
> +++ b/include/asm-ia64/machvec.h
> @@ -20,6 +20,7 @@ struct scatterlist;
> struct page;
> struct mm_struct;
> struct pci_bus;
> +struct task_struct;
>
> typedef void ia64_mv_setup_t (char **);
> typedef void ia64_mv_cpu_init_t (void);
> @@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
> u8 size);
> typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
> u8 size);
> +typedef void ia64_mv_migrate_t(struct task_struct * task);
>
> /* DMA-mapping interface: */
> typedef void ia64_mv_dma_init (void);
> @@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
> {
> }
>
> +static inline void
> +machvec_noop_task (struct task_struct *task)
> +{
> +}
> +
> extern void machvec_setup (char **);
> extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
> extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
> @@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
> # define platform_readw_relaxed ia64_mv.readw_relaxed
> # define platform_readl_relaxed ia64_mv.readl_relaxed
> # define platform_readq_relaxed ia64_mv.readq_relaxed
> +# define platform_migrate ia64_mv.migrate
> # endif
>
> /* __attribute__((__aligned__(16))) is required to make size of the
> @@ -194,6 +202,7 @@ struct ia64_machine_vector {
> ia64_mv_readw_relaxed_t *readw_relaxed;
> ia64_mv_readl_relaxed_t *readl_relaxed;
> ia64_mv_readq_relaxed_t *readq_relaxed;
> + ia64_mv_migrate_t *migrate;
> } __attribute__((__aligned__(16))); /* align attrib? see above comment */
>
> #define MACHVEC_INIT(name) \
> @@ -238,6 +247,7 @@ struct ia64_machine_vector {
> platform_readw_relaxed, \
> platform_readl_relaxed, \
> platform_readq_relaxed, \
> + platform_migrate, \
> }
>
> extern struct ia64_machine_vector ia64_mv;
> @@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
> #ifndef platform_readq_relaxed
> # define platform_readq_relaxed __ia64_readq_relaxed
> #endif
> +#ifndef platform_migrate
> +# define platform_migrate machvec_noop_task
> +#endif
>
> #endif /* _ASM_IA64_MACHVEC_H */
> diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
> index e1b6cd6..6f0021b 100644
> --- a/include/asm-ia64/machvec_sn2.h
> +++ b/include/asm-ia64/machvec_sn2.h
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
> + * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
> *
> * This program is free software; you can redistribute it and/or modify it
> * under the terms of version 2 of the GNU General Public License
> @@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
> extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
> extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
> extern ia64_mv_dma_supported sn_dma_supported;
> +extern ia64_mv_migrate_t sn_migrate;
>
> /*
> * This stuff has dual use!
> @@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
> #define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
> #define platform_dma_mapping_error sn_dma_mapping_error
> #define platform_dma_supported sn_dma_supported
> +#define platform_migrate sn_migrate
>
> #include <asm/sn/io.h>
>
> diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
> index 09b9902..8eabfec 100644
> --- a/include/asm-ia64/processor.h
> +++ b/include/asm-ia64/processor.h
> @@ -50,7 +50,7 @@
> #define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
> #define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
> #define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
> - /* bit 5 is currently unused */
> +#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration sync at ctx sw */
80 columns.
> #define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
> #define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
>
> diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
> index 80c5a23..f382dea 100644
> --- a/include/asm-ia64/system.h
> +++ b/include/asm-ia64/system.h
> @@ -244,6 +244,12 @@ extern void ia64_load_extra (struct task
> __ia64_save_fpu((prev)->thread.fph); \
> } \
> __switch_to(prev, next, last); \
> + /* "next" in old context is "current" in new context */ \
> + if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
> + (task_cpu(current) != task_thread_info(current)->last_cpu))) { \
> + platform_migrate(current); \
> + task_thread_info(current)->last_cpu = task_cpu(current); \
> + } \
> } while (0)
whole segment exceeds 80 columns.
> #else
> # define switch_to(prev,next,last) __switch_to(prev, next, last)
> diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
> index 1d6518f..81641a6 100644
> --- a/include/asm-ia64/thread_info.h
> +++ b/include/asm-ia64/thread_info.h
> @@ -26,6 +26,7 @@ struct thread_info {
> struct exec_domain *exec_domain;/* execution domain */
> __u32 flags; /* thread_info flags (see TIF_*) */
> __u32 cpu; /* current CPU */
> + __u32 last_cpu; /* Last CPU thread ran on */
> mm_segment_t addr_limit; /* user-level address space limit */
> int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
> struct restart_block restart_block;
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (49 preceding siblings ...)
2006-01-26 1:31 ` Prarit Bhargava
@ 2006-01-26 2:43 ` Keith Owens
2006-01-26 4:40 ` Brent Casavant
` (13 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Keith Owens @ 2006-01-26 2:43 UTC (permalink / raw)
To: linux-ia64
Brent Casavant (on Wed, 25 Jan 2006 11:48:57 -0600 (CST)) wrote:
>On Wed, 25 Jan 2006, Keith Owens wrote:
>
>> Brent Casavant (on Tue, 24 Jan 2006 15:12:48 -0600 (CST)) wrote:
>> >+void sn_switch_from(struct task_struct *task)
>> >+{
>> >+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
>> >+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
>> >+ unsigned long val = last_pda->pio_write_status_val;
>> >+
>> >+ /* Drain PIO writes from old CPU's Shub */
>> >+ while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
>> >+ cpu_relax();
>> >+}
>>
>> cpu_relax() maps to ia64_hint(ia64_hint_pause) which is defined as a
>> memory clobber, so you do not need to mark adr as volatile. Linus
>> recommends against relying on volatile in C code, memory barrier
>> operations like cpu_relax should be used instead.
>
>Hmm. Does that mean that the SN mmiowb() implementation is incorrect
>as well? Because this was pretty much a cut-n-paste job from there.
Yes.
>I can probably fix that (mmiowb()) in the near future, as there's
>some upcoming work needed in that area.
>
>> Alas include/asm-ia64/intel_intrin.h defines ia64_hint() as a no-op.
>
>Err, so we should hold off on fixing this plus mmiowb() until the
>intrinsic is fixed, correct?
Yes.
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (50 preceding siblings ...)
2006-01-26 2:43 ` Keith Owens
@ 2006-01-26 4:40 ` Brent Casavant
2006-01-26 16:29 ` Brent Casavant
` (12 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-26 4:40 UTC (permalink / raw)
To: linux-ia64
On Wed, 25 Jan 2006, Luck, Tony wrote:
> says that the changelog comes first, and the random commentary
> comes after the "---" line (see paragraph "g"). You have this
> backwards, so git attempted to use the wrong bit of text as the
> changelog.
Sorry about that.
> > Signed-off-by: Brent Casavant <bcasavan@sgi.com>
> > Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
> > Signed-off-by: Brent Casavant <bcasavan@sgi.com>
>
> Perhaps a little presumptious to assume that Ken is ok with
> the combined version of his part of this patch (ok, I checked
> with Ken, and as it happens he is ok with it). One signoff from
> you would be sufficient.
Sorry. Wasn't sure how to handle the "I got part of this from
so-and-so, but hacked on it myself, and here's the new version".
No offense or presumption intended.
I get a new patch fired off in the morning (at home right now,
pain to edit things remotely).
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (51 preceding siblings ...)
2006-01-26 4:40 ` Brent Casavant
@ 2006-01-26 16:29 ` Brent Casavant
2006-01-26 16:41 ` Prarit Bhargava
` (11 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-26 16:29 UTC (permalink / raw)
To: linux-ia64
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted. This
is placed in the context switch-in path, and only performs the wait
when the newly scheduled task changes CPUs.
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
---
Take 6. Fixed build warnings. This reintroduces the "volatile"
specifier that Keith and Ken want removed, as the pda_t has
volatile specifiers in its member elements, and as Keith said
that we should keep the specifier until the icc ia64_hint()
macro is fixed (Ken currently has a patch pending).
arch/ia64/sn/kernel/setup.c | 6 ++++--
arch/ia64/sn/kernel/sn2/sn2_smp.c | 22 +++++++++++++++++++++-
include/asm-ia64/machvec.h | 13 +++++++++++++
include/asm-ia64/machvec_sn2.h | 4 +++-
include/asm-ia64/processor.h | 2 +-
include/asm-ia64/system.h | 6 ++++++
include/asm-ia64/thread_info.h | 1 +
7 files changed, 49 insertions(+), 5 deletions(-)
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
index e510dce..06f08b3 100644
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
* for sn.
*/
pm_power_off = ia64_sn_power_down;
+ current->thread.flags |= IA64_THREAD_MIGRATION;
}
/**
@@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+ pda->pio_write_status_addr = (volatile unsigned long *)
+ GLOBAL_MMR_ADDR(nasid, pio[slice]);
pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
index 471bbaa..bd0e138 100644
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_migrate - SN-specific task migration actions
+ * @task: Task being migrated to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_migrate(struct task_struct *task)
+{
+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
+ unsigned long val = last_pda->pio_write_status_val;
+
+ /* Drain PIO writes from old CPU's Shub */
+ while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val))
+ cpu_relax();
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm = current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
index ca5ea99..c3e4ed8 100644
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -20,6 +20,7 @@ struct scatterlist;
struct page;
struct mm_struct;
struct pci_bus;
+struct task_struct;
typedef void ia64_mv_setup_t (char **);
typedef void ia64_mv_cpu_init_t (void);
@@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_migrate_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_migrate ia64_mv.migrate
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +202,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_migrate_t *migrate;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +247,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_migrate, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_migrate
+# define platform_migrate machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
index e1b6cd6..6f0021b 100644
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_migrate_t sn_migrate;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_migrate sn_migrate
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
index 09b9902..8eabfec 100644
--- a/include/asm-ia64/processor.h
+++ b/include/asm-ia64/processor.h
@@ -50,7 +50,7 @@
#define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
#define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
#define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
- /* bit 5 is currently unused */
+#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration sync at ctx sw */
#define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
#define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 80c5a23..f382dea 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -244,6 +244,12 @@ extern void ia64_load_extra (struct task
__ia64_save_fpu((prev)->thread.fph); \
} \
__switch_to(prev, next, last); \
+ /* "next" in old context is "current" in new context */ \
+ if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
+ (task_cpu(current) != task_thread_info(current)->last_cpu))) { \
+ platform_migrate(current); \
+ task_thread_info(current)->last_cpu = task_cpu(current); \
+ } \
} while (0)
#else
# define switch_to(prev,next,last) __switch_to(prev, next, last)
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
index 1d6518f..81641a6 100644
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -26,6 +26,7 @@ struct thread_info {
struct exec_domain *exec_domain;/* execution domain */
__u32 flags; /* thread_info flags (see TIF_*) */
__u32 cpu; /* current CPU */
+ __u32 last_cpu; /* Last CPU thread ran on */
mm_segment_t addr_limit; /* user-level address space limit */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
struct restart_block restart_block;
^ permalink raw reply related [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (52 preceding siblings ...)
2006-01-26 16:29 ` Brent Casavant
@ 2006-01-26 16:41 ` Prarit Bhargava
2006-01-26 19:29 ` Brent Casavant
` (10 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Prarit Bhargava @ 2006-01-26 16:41 UTC (permalink / raw)
To: linux-ia64
Brent Casavant wrote:
>
> Signed-off-by: Brent Casavant <bcasavan@sgi.com>
Nak. CodingStyle issues.
>
> ---
>
> Take 6. Fixed build warnings. This reintroduces the "volatile"
> specifier that Keith and Ken want removed, as the pda_t has
> volatile specifiers in its member elements, and as Keith said
> that we should keep the specifier until the icc ia64_hint()
> macro is fixed (Ken currently has a patch pending).
>
> arch/ia64/sn/kernel/setup.c | 6 ++++--
> arch/ia64/sn/kernel/sn2/sn2_smp.c | 22 +++++++++++++++++++++-
> include/asm-ia64/machvec.h | 13 +++++++++++++
> include/asm-ia64/machvec_sn2.h | 4 +++-
> include/asm-ia64/processor.h | 2 +-
> include/asm-ia64/system.h | 6 ++++++
> include/asm-ia64/thread_info.h | 1 +
> 7 files changed, 49 insertions(+), 5 deletions(-)
>
> diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
> index e510dce..06f08b3 100644
> --- a/arch/ia64/sn/kernel/setup.c
> +++ b/arch/ia64/sn/kernel/setup.c
> @@ -3,7 +3,7 @@
> * License. See the file "COPYING" in the main directory of this archive
> * for more details.
> *
> - * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
> + * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
> */
>
> #include <linux/config.h>
> @@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
> * for sn.
> */
> pm_power_off = ia64_sn_power_down;
> + current->thread.flags |= IA64_THREAD_MIGRATION;
> }
>
> /**
> @@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
> SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
> u64 *pio;
> pio = is_shub1() ? pio1 : pio2;
> - pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
> + pda->pio_write_status_addr = (volatile unsigned long *)
> + GLOBAL_MMR_ADDR(nasid, pio[slice]);
ALIGNMENT.
> pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
> }
>
> diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
> index 471bbaa..bd0e138 100644
> --- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
> +++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
> @@ -5,7 +5,7 @@
> * License. See the file "COPYING" in the main directory of this archive
> * for more details.
> *
> - * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
> + * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
> */
>
> #include <linux/init.h>
> @@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
> return ws;
> }
>
> +/**
> + * sn_migrate - SN-specific task migration actions
> + * @task: Task being migrated to new CPU
> + *
> + * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
> + * Context switching user threads which have memory-mapped MMIO may cause
> + * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
> + * from the previous CPU's Shub before execution resumes on the new CPU.
> + */
> +void sn_migrate(struct task_struct *task)
> +{
> + pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
> + volatile unsigned long *adr = last_pda->pio_write_status_addr;
> + unsigned long val = last_pda->pio_write_status_val;
> +
> + /* Drain PIO writes from old CPU's Shub */
> + while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val))
80 COLUMNS.
> + cpu_relax();
> +}
> +
> void sn_tlb_migrate_finish(struct mm_struct *mm)
> {
> if (mm = current->mm)
> diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
> index ca5ea99..c3e4ed8 100644
> --- a/include/asm-ia64/machvec.h
> +++ b/include/asm-ia64/machvec.h
> @@ -20,6 +20,7 @@ struct scatterlist;
> struct page;
> struct mm_struct;
> struct pci_bus;
> +struct task_struct;
>
> typedef void ia64_mv_setup_t (char **);
> typedef void ia64_mv_cpu_init_t (void);
> @@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
> u8 size);
> typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
> u8 size);
> +typedef void ia64_mv_migrate_t(struct task_struct * task);
>
> /* DMA-mapping interface: */
> typedef void ia64_mv_dma_init (void);
> @@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
> {
> }
>
> +static inline void
> +machvec_noop_task (struct task_struct *task)
> +{
> +}
> +
> extern void machvec_setup (char **);
> extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
> extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
> @@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
> # define platform_readw_relaxed ia64_mv.readw_relaxed
> # define platform_readl_relaxed ia64_mv.readl_relaxed
> # define platform_readq_relaxed ia64_mv.readq_relaxed
> +# define platform_migrate ia64_mv.migrate
> # endif
>
> /* __attribute__((__aligned__(16))) is required to make size of the
> @@ -194,6 +202,7 @@ struct ia64_machine_vector {
> ia64_mv_readw_relaxed_t *readw_relaxed;
> ia64_mv_readl_relaxed_t *readl_relaxed;
> ia64_mv_readq_relaxed_t *readq_relaxed;
> + ia64_mv_migrate_t *migrate;
> } __attribute__((__aligned__(16))); /* align attrib? see above comment */
>
> #define MACHVEC_INIT(name) \
> @@ -238,6 +247,7 @@ struct ia64_machine_vector {
> platform_readw_relaxed, \
> platform_readl_relaxed, \
> platform_readq_relaxed, \
> + platform_migrate, \
> }
>
> extern struct ia64_machine_vector ia64_mv;
> @@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
> #ifndef platform_readq_relaxed
> # define platform_readq_relaxed __ia64_readq_relaxed
> #endif
> +#ifndef platform_migrate
> +# define platform_migrate machvec_noop_task
> +#endif
>
> #endif /* _ASM_IA64_MACHVEC_H */
> diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
> index e1b6cd6..6f0021b 100644
> --- a/include/asm-ia64/machvec_sn2.h
> +++ b/include/asm-ia64/machvec_sn2.h
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
> + * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
> *
> * This program is free software; you can redistribute it and/or modify it
> * under the terms of version 2 of the GNU General Public License
> @@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
> extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
> extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
> extern ia64_mv_dma_supported sn_dma_supported;
> +extern ia64_mv_migrate_t sn_migrate;
>
> /*
> * This stuff has dual use!
> @@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
> #define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
> #define platform_dma_mapping_error sn_dma_mapping_error
> #define platform_dma_supported sn_dma_supported
> +#define platform_migrate sn_migrate
>
> #include <asm/sn/io.h>
>
> diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
> index 09b9902..8eabfec 100644
> --- a/include/asm-ia64/processor.h
> +++ b/include/asm-ia64/processor.h
> @@ -50,7 +50,7 @@
> #define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
> #define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
> #define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
> - /* bit 5 is currently unused */
> +#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration sync at ctx sw */
80 COLUMNS.
> #define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
> #define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
>
> diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
> index 80c5a23..f382dea 100644
> --- a/include/asm-ia64/system.h
> +++ b/include/asm-ia64/system.h
> @@ -244,6 +244,12 @@ extern void ia64_load_extra (struct task
> __ia64_save_fpu((prev)->thread.fph); \
> } \
> __switch_to(prev, next, last); \
> + /* "next" in old context is "current" in new context */ \
> + if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
> + (task_cpu(current) != task_thread_info(current)->last_cpu))) { \
> + platform_migrate(current); \
> + task_thread_info(current)->last_cpu = task_cpu(current); \
> + } \
80 COLUMNS.
> } while (0)
> #else
> # define switch_to(prev,next,last) __switch_to(prev, next, last)
> diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
> index 1d6518f..81641a6 100644
> --- a/include/asm-ia64/thread_info.h
> +++ b/include/asm-ia64/thread_info.h
> @@ -26,6 +26,7 @@ struct thread_info {
> struct exec_domain *exec_domain;/* execution domain */
> __u32 flags; /* thread_info flags (see TIF_*) */
> __u32 cpu; /* current CPU */
> + __u32 last_cpu; /* Last CPU thread ran on */
> mm_segment_t addr_limit; /* user-level address space limit */
> int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
> struct restart_block restart_block;
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (53 preceding siblings ...)
2006-01-26 16:41 ` Prarit Bhargava
@ 2006-01-26 19:29 ` Brent Casavant
2006-01-26 19:54 ` Luck, Tony
` (9 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-26 19:29 UTC (permalink / raw)
To: linux-ia64
On Thu, 26 Jan 2006, Prarit Bhargava wrote:
> > diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
> > index e510dce..06f08b3 100644
> > --- a/arch/ia64/sn/kernel/setup.c
> > +++ b/arch/ia64/sn/kernel/setup.c
> > @@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
> > SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
> > u64 *pio;
> > pio = is_shub1() ? pio1 : pio2;
> > - pda->pio_write_status_addr = (volatile unsigned long *)
> > LOCAL_MMR_ADDR(pio[slice]);
> > + pda->pio_write_status_addr = (volatile unsigned long *)
> > + GLOBAL_MMR_ADDR(nasid, pio[slice]);
>
> ALIGNMENT.
I don't see a better way to handle it without violating 80 columns.
> > diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c
> > b/arch/ia64/sn/kernel/sn2/sn2_smp.c
> > index 471bbaa..bd0e138 100644
> > --- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
> > +++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
> > @@ -169,6 +169,26 @@ static inline unsigned long wait_piowc(v
> > return ws;
> > }
> > +/**
> > + * sn_migrate - SN-specific task migration actions
> > + * @task: Task being migrated to new CPU
> > + *
> > + * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
> > + * Context switching user threads which have memory-mapped MMIO may cause
> > + * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
> > + * from the previous CPU's Shub before execution resumes on the new CPU.
> > + */
> > +void sn_migrate(struct task_struct *task)
> > +{
> > + pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
> > + volatile unsigned long *adr = last_pda->pio_write_status_addr;
> > + unsigned long val = last_pda->pio_write_status_val;
> > +
> > + /* Drain PIO writes from old CPU's Shub */
> > + while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK)
> > != val))
>
> 80 COLUMNS.
Again, tell me how to split this such that it fits in 80 and I'll gladly
do it. I don't see it.
> > diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
> > index 09b9902..8eabfec 100644
> > --- a/include/asm-ia64/processor.h
> > +++ b/include/asm-ia64/processor.h
> > @@ -50,7 +50,7 @@
> > #define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance
> > registers valid? */
> > #define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log
> > unaligned accesses */
> > #define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate
> > SIGBUS on unaligned acc. */
> > - /* bit 5 is currently
> > unused */
> > +#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require
> > migration sync at ctx sw */
>
> 80 COLUMNS.
Ditto.
> > diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
> > index 80c5a23..f382dea 100644
> > --- a/include/asm-ia64/system.h
> > +++ b/include/asm-ia64/system.h
> > @@ -244,6 +244,12 @@ extern void ia64_load_extra (struct task
> > __ia64_save_fpu((prev)->thread.fph);
> > \
> > }
> > \
> > __switch_to(prev, next, last);
> > \
> > + /* "next" in old context is "current" in new context */
> > \
> > + if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) &&
> > \
> > + (task_cpu(current) !> > task_thread_info(current)->last_cpu))) { \
> > + platform_migrate(current);
> > \
> > + task_thread_info(current)->last_cpu = task_cpu(current);
> > \
> > + }
> > \
>
> 80 COLUMNS.
Following existing precedent in the same macro, and again no better
way to do it.
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (54 preceding siblings ...)
2006-01-26 19:29 ` Brent Casavant
@ 2006-01-26 19:54 ` Luck, Tony
2006-01-26 20:28 ` Brent Casavant
` (8 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-26 19:54 UTC (permalink / raw)
To: linux-ia64
>> > + while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val))
>> 80 COLUMNS.
>
>Again, tell me how to split this such that it fits in 80 and I'll gladly
>do it. I don't see it.
Perhaps "SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK" is just a bit too
verbose/descriptive? 44 characters in a name is just asking for problems
in an 80 column environment.
-Tony
Longest symbol in the kernel sources has you beat by a long way though:
ZORRO_PROD_PHASE5_BLIZZARD_1230_II_FASTLANE_Z3_CYBERSCSI_CYBERSTORM060
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (55 preceding siblings ...)
2006-01-26 19:54 ` Luck, Tony
@ 2006-01-26 20:28 ` Brent Casavant
2006-01-26 21:05 ` Luck, Tony
` (7 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Brent Casavant @ 2006-01-26 20:28 UTC (permalink / raw)
To: linux-ia64
On Thu, 26 Jan 2006, Luck, Tony wrote:
> Perhaps "SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK" is just a bit too
> verbose/descriptive? 44 characters in a name is just asking for problems
> in an 80 column environment.
I agree. As changing that name would mean mixing what's essentially
a coding style cleanup with a "meaty" change, how would you like me
to proceed?
As I mentioned earlier in this discussion, there are some changes
needed in the area of mmiowb(), which is the primary user of this
definition. Perhaps such a change is best left for that effort?
Brent
--
Brent Casavant All music is folk music. I ain't
bcasavan@sgi.com never heard a horse sing a song.
Silicon Graphics, Inc. -- Louis Armstrong
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (56 preceding siblings ...)
2006-01-26 20:28 ` Brent Casavant
@ 2006-01-26 21:05 ` Luck, Tony
2006-01-26 21:34 ` Prarit Bhargava
` (6 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-26 21:05 UTC (permalink / raw)
To: linux-ia64
> I agree. As changing that name would mean mixing what's essentially
> a coding style cleanup with a "meaty" change, how would you like me
> to proceed?
>
> As I mentioned earlier in this discussion, there are some changes
> needed in the area of mmiowb(), which is the primary user of this
> definition. Perhaps such a change is best left for that effort?
I'll take "take 6" into the test tree (and onto Linus for 2.6.17).
Please don't forget to come back and do the cleanups.
-Tony
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (57 preceding siblings ...)
2006-01-26 21:05 ` Luck, Tony
@ 2006-01-26 21:34 ` Prarit Bhargava
2006-01-26 22:11 ` Luck, Tony
` (5 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Prarit Bhargava @ 2006-01-26 21:34 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 51 bytes --]
Tony -- please take this version if possible.
P.
[-- Attachment #2: newpatch.patch --]
[-- Type: text/plain, Size: 9836 bytes --]
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted. This
is placed in the context switch-in path, and only performs the wait
when the newly scheduled task changes CPUs.
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
---
Take 6. Fixed build warnings. This reintroduces the "volatile"
specifier that Keith and Ken want removed, as the pda_t has
volatile specifiers in its member elements, and as Keith said
that we should keep the specifier until the icc ia64_hint()
macro is fixed (Ken currently has a patch pending).
---
commit 2052b6c00ba62f4dc9bedf7ee6f5400b118cf982
tree 95847e201979c0f8d54d27226e0eeb52893d99f2
parent 3ee68c4af3fd7228c1be63254b9f884614f9ebb2
author Brent Casavant <bcasavan@sgi.com> Thu, 26 Jan 2006 17:24:00 -0500
committer Brent Casavant <bcasavan@sgi.com> Thu, 26 Jan 2006 17:24:00 -0500
arch/ia64/sn/kernel/setup.c | 6 ++++--
arch/ia64/sn/kernel/sn2/sn2_smp.c | 23 ++++++++++++++++++++++-
include/asm-ia64/machvec.h | 13 +++++++++++++
include/asm-ia64/machvec_sn2.h | 4 +++-
include/asm-ia64/processor.h | 3 ++-
include/asm-ia64/system.h | 7 +++++++
include/asm-ia64/thread_info.h | 1 +
7 files changed, 52 insertions(+), 5 deletions(-)
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
* for sn.
*/
pm_power_off = ia64_sn_power_down;
+ current->thread.flags |= IA64_THREAD_MIGRATION;
}
/**
@@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+ pda->pio_write_status_addr =
+ (volatile unsigned long *)GLOBAL_MMR_ADDR(nasid, pio[slice]);
pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,27 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_migrate - SN-specific task migration actions
+ * @task: Task being migrated to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_migrate(struct task_struct *task)
+{
+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
+ unsigned long *adr = last_pda->pio_write_status_addr;;
+ unsigned long val = last_pda->pio_write_status_val;
+
+ /* Drain PIO writes from old CPU's Shub */
+ while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK)
+ != val))
+ cpu_relax();
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm == current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -20,6 +20,7 @@ struct scatterlist;
struct page;
struct mm_struct;
struct pci_bus;
+struct task_struct;
typedef void ia64_mv_setup_t (char **);
typedef void ia64_mv_cpu_init_t (void);
@@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_migrate_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_migrate ia64_mv.migrate
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +202,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_migrate_t *migrate;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +247,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_migrate, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_migrate
+# define platform_migrate machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_migrate_t sn_migrate;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_migrate sn_migrate
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
--- a/include/asm-ia64/processor.h
+++ b/include/asm-ia64/processor.h
@@ -50,7 +50,8 @@
#define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
#define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
#define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
- /* bit 5 is currently unused */
+#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration
+ sync at ctx sw */
#define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
#define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -244,6 +244,13 @@ extern void ia64_load_extra (struct task
__ia64_save_fpu((prev)->thread.fph); \
} \
__switch_to(prev, next, last); \
+ /* "next" in old context is "current" in new context */ \
+ if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
+ (task_cpu(current) != \
+ task_thread_info(current)->last_cpu))) { \
+ platform_migrate(current); \
+ task_thread_info(current)->last_cpu = task_cpu(current); \
+ } \
} while (0)
#else
# define switch_to(prev,next,last) __switch_to(prev, next, last)
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -26,6 +26,7 @@ struct thread_info {
struct exec_domain *exec_domain;/* execution domain */
__u32 flags; /* thread_info flags (see TIF_*) */
__u32 cpu; /* current CPU */
+ __u32 last_cpu; /* Last CPU thread ran on */
mm_segment_t addr_limit; /* user-level address space limit */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
struct restart_block restart_block;
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (58 preceding siblings ...)
2006-01-26 21:34 ` Prarit Bhargava
@ 2006-01-26 22:11 ` Luck, Tony
2006-01-26 23:08 ` Luck, Tony
` (4 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-26 22:11 UTC (permalink / raw)
To: linux-ia64
Ok, got it.
-Tony
-----Original Message-----
From: Prarit Bhargava [mailto:prarit@sgi.com]
Sent: Thursday, January 26, 2006 1:35 PM
To: Luck, Tony
Cc: Brent Casavant; linux-ia64@vger.kernel.org
Subject: Re: [PATCH] SN2 user-MMIO CPU migration
Tony -- please take this version if possible.
P.
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (59 preceding siblings ...)
2006-01-26 22:11 ` Luck, Tony
@ 2006-01-26 23:08 ` Luck, Tony
2006-01-26 23:21 ` Prarit Bhargava
` (3 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Luck, Tony @ 2006-01-26 23:08 UTC (permalink / raw)
To: linux-ia64
But sadly it still says:
arch/ia64/sn/kernel/sn2/sn2_smp.c:110: warning: initialization discards qualifiers from pointer target type
arch/ia64/sn/kernel/sn2/sn2_smp.c:111: warning: ISO C90 forbids mixed declarations and code
The double ;; is responsible for error on 111, need an extra volatile on 110?
-Tony
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (60 preceding siblings ...)
2006-01-26 23:08 ` Luck, Tony
@ 2006-01-26 23:21 ` Prarit Bhargava
2006-01-26 23:44 ` Prarit Bhargava
` (2 subsequent siblings)
64 siblings, 0 replies; 73+ messages in thread
From: Prarit Bhargava @ 2006-01-26 23:21 UTC (permalink / raw)
To: linux-ia64
Luck, Tony wrote:
> But sadly it still says:
>
> arch/ia64/sn/kernel/sn2/sn2_smp.c:110: warning: initialization discards qualifiers from pointer target type
> arch/ia64/sn/kernel/sn2/sn2_smp.c:111: warning: ISO C90 forbids mixed declarations and code
>
> The double ;; is responsible for error on 111, need an extra volatile on 110?
Sorry Tony -- I compiled and diffed against a compile with the original
patch. ie) I unwisely assumed that these errors had been cleaned up ...
Another patch coming quickly.
P.
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (61 preceding siblings ...)
2006-01-26 23:21 ` Prarit Bhargava
@ 2006-01-26 23:44 ` Prarit Bhargava
2006-01-27 0:07 ` Chen, Kenneth W
2006-01-27 14:01 ` Prarit Bhargava
64 siblings, 0 replies; 73+ messages in thread
From: Prarit Bhargava @ 2006-01-26 23:44 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 434 bytes --]
Luck, Tony wrote:
> But sadly it still says:
>
> arch/ia64/sn/kernel/sn2/sn2_smp.c:110: warning: initialization discards qualifiers from pointer target type
> arch/ia64/sn/kernel/sn2/sn2_smp.c:111: warning: ISO C90 forbids mixed declarations and code
>
> The double ;; is responsible for error on 111, need an extra volatile on 110?
Tony -- again, my apologies. This time I double-checked that sn2_smp.c
cleanly compiles.
P.
[-- Attachment #2: newpatch.patch --]
[-- Type: text/plain, Size: 9906 bytes --]
On SN2, MMIO writes which are issued from separate processors are not
guaranteed to arrive in any particular order at the IO hardware. When
performing such writes from the kernel this is not a problem, as a
kernel thread will not migrate to another CPU during execution, and
mmiowb() calls can guarantee write ordering when control of the IO
resource is allowed to move between threads.
However, when MMIO writes can be performed from user space (e.g. DRM)
there are no such guarantees and mechanisms, as the process may
context-switch at any time, and may migrate to a different CPU as part
of the switch. For such programs/hardware to operate correctly, it is
required that the MMIO writes from the old CPU be accepted by the IO
hardware before subsequent writes from the new CPU can be issued.
The following patch implements this behavior on SN2 by waiting for a
Shub register to indicate that these writes have been accepted. This
is placed in the context switch-in path, and only performs the wait
when the newly scheduled task changes CPUs.
Signed-off-by: Prarit Bhargava <prarit@sgi.com>
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
---
Take 6. Fixed build warnings. This reintroduces the "volatile"
specifier that Keith and Ken want removed, as the pda_t has
volatile specifiers in its member elements, and as Keith said
that we should keep the specifier until the icc ia64_hint()
macro is fixed (Ken currently has a patch pending).
---
commit ff50f390c79bb090166bd1fdd653be22cbaca5d3
tree 251dd80647bd3a0140f5f31c35c125094c035f9c
parent 3ee68c4af3fd7228c1be63254b9f884614f9ebb2
author root <root@altix3.lab.boston.redhat.com> Thu, 26 Jan 2006 19:34:57 -0500
committer root <root@altix3.lab.boston.redhat.com> Thu, 26 Jan 2006 19:34:57 -0500
arch/ia64/sn/kernel/setup.c | 6 ++++--
arch/ia64/sn/kernel/sn2/sn2_smp.c | 23 ++++++++++++++++++++++-
include/asm-ia64/machvec.h | 13 +++++++++++++
include/asm-ia64/machvec_sn2.h | 4 +++-
include/asm-ia64/processor.h | 3 ++-
include/asm-ia64/system.h | 7 +++++++
include/asm-ia64/thread_info.h | 1 +
7 files changed, 52 insertions(+), 5 deletions(-)
diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/config.h>
@@ -496,6 +496,7 @@ void __init sn_setup(char **cmdline_p)
* for sn.
*/
pm_power_off = ia64_sn_power_down;
+ current->thread.flags |= IA64_THREAD_MIGRATION;
}
/**
@@ -654,7 +655,8 @@ void __init sn_cpu_init(void)
SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
u64 *pio;
pio = is_shub1() ? pio1 : pio2;
- pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+ pda->pio_write_status_addr =
+ (volatile unsigned long *)GLOBAL_MMR_ADDR(nasid, pio[slice]);
pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
}
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
* License. See the file "COPYING" in the main directory of this archive
* for more details.
*
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
*/
#include <linux/init.h>
@@ -169,6 +169,27 @@ static inline unsigned long wait_piowc(v
return ws;
}
+/**
+ * sn_migrate - SN-specific task migration actions
+ * @task: Task being migrated to new CPU
+ *
+ * SN2 PIO writes from separate CPUs are not guaranteed to arrive in order.
+ * Context switching user threads which have memory-mapped MMIO may cause
+ * PIOs to issue from seperate CPUs, thus the PIO writes must be drained
+ * from the previous CPU's Shub before execution resumes on the new CPU.
+ */
+void sn_migrate(struct task_struct *task)
+{
+ pda_t *last_pda = pdacpu(task_thread_info(task)->last_cpu);
+ volatile unsigned long *adr = last_pda->pio_write_status_addr;
+ unsigned long val = last_pda->pio_write_status_val;
+
+ /* Drain PIO writes from old CPU's Shub */
+ while (unlikely((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK)
+ != val))
+ cpu_relax();
+}
+
void sn_tlb_migrate_finish(struct mm_struct *mm)
{
if (mm == current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -20,6 +20,7 @@ struct scatterlist;
struct page;
struct mm_struct;
struct pci_bus;
+struct task_struct;
typedef void ia64_mv_setup_t (char **);
typedef void ia64_mv_cpu_init_t (void);
@@ -34,6 +35,7 @@ typedef int ia64_mv_pci_legacy_read_t (s
u8 size);
typedef int ia64_mv_pci_legacy_write_t (struct pci_bus *, u16 port, u32 val,
u8 size);
+typedef void ia64_mv_migrate_t(struct task_struct * task);
/* DMA-mapping interface: */
typedef void ia64_mv_dma_init (void);
@@ -85,6 +87,11 @@ machvec_noop_mm (struct mm_struct *mm)
{
}
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
extern void machvec_setup (char **);
extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
@@ -146,6 +153,7 @@ extern void machvec_tlb_migrate_finish (
# define platform_readw_relaxed ia64_mv.readw_relaxed
# define platform_readl_relaxed ia64_mv.readl_relaxed
# define platform_readq_relaxed ia64_mv.readq_relaxed
+# define platform_migrate ia64_mv.migrate
# endif
/* __attribute__((__aligned__(16))) is required to make size of the
@@ -194,6 +202,7 @@ struct ia64_machine_vector {
ia64_mv_readw_relaxed_t *readw_relaxed;
ia64_mv_readl_relaxed_t *readl_relaxed;
ia64_mv_readq_relaxed_t *readq_relaxed;
+ ia64_mv_migrate_t *migrate;
} __attribute__((__aligned__(16))); /* align attrib? see above comment */
#define MACHVEC_INIT(name) \
@@ -238,6 +247,7 @@ struct ia64_machine_vector {
platform_readw_relaxed, \
platform_readl_relaxed, \
platform_readq_relaxed, \
+ platform_migrate, \
}
extern struct ia64_machine_vector ia64_mv;
@@ -386,5 +396,8 @@ extern ia64_mv_dma_supported swiotlb_dm
#ifndef platform_readq_relaxed
# define platform_readq_relaxed __ia64_readq_relaxed
#endif
+#ifndef platform_migrate
+# define platform_migrate machvec_noop_task
+#endif
#endif /* _ASM_IA64_MACHVEC_H */
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc. All Rights Reserved.
+ * Copyright (c) 2002-2003,2006 Silicon Graphics, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License
@@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_single_for_devic
extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device;
extern ia64_mv_dma_mapping_error sn_dma_mapping_error;
extern ia64_mv_dma_supported sn_dma_supported;
+extern ia64_mv_migrate_t sn_migrate;
/*
* This stuff has dual use!
@@ -120,6 +121,7 @@ extern ia64_mv_dma_supported sn_dma_sup
#define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device
#define platform_dma_mapping_error sn_dma_mapping_error
#define platform_dma_supported sn_dma_supported
+#define platform_migrate sn_migrate
#include <asm/sn/io.h>
diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
--- a/include/asm-ia64/processor.h
+++ b/include/asm-ia64/processor.h
@@ -50,7 +50,8 @@
#define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
#define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
#define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
- /* bit 5 is currently unused */
+#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration
+ sync at ctx sw */
#define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
#define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -244,6 +244,13 @@ extern void ia64_load_extra (struct task
__ia64_save_fpu((prev)->thread.fph); \
} \
__switch_to(prev, next, last); \
+ /* "next" in old context is "current" in new context */ \
+ if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
+ (task_cpu(current) != \
+ task_thread_info(current)->last_cpu))) { \
+ platform_migrate(current); \
+ task_thread_info(current)->last_cpu = task_cpu(current); \
+ } \
} while (0)
#else
# define switch_to(prev,next,last) __switch_to(prev, next, last)
diff --git a/include/asm-ia64/thread_info.h b/include/asm-ia64/thread_info.h
--- a/include/asm-ia64/thread_info.h
+++ b/include/asm-ia64/thread_info.h
@@ -26,6 +26,7 @@ struct thread_info {
struct exec_domain *exec_domain;/* execution domain */
__u32 flags; /* thread_info flags (see TIF_*) */
__u32 cpu; /* current CPU */
+ __u32 last_cpu; /* Last CPU thread ran on */
mm_segment_t addr_limit; /* user-level address space limit */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
struct restart_block restart_block;
^ permalink raw reply [flat|nested] 73+ messages in thread* RE: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (62 preceding siblings ...)
2006-01-26 23:44 ` Prarit Bhargava
@ 2006-01-27 0:07 ` Chen, Kenneth W
2006-01-27 14:01 ` Prarit Bhargava
64 siblings, 0 replies; 73+ messages in thread
From: Chen, Kenneth W @ 2006-01-27 0:07 UTC (permalink / raw)
To: linux-ia64
Prarit Bhargava wrote on Thursday, January 26, 2006 3:44 PM
> diff --git a/include/asm-ia64/processor.h b/include/asm-ia64/processor.h
> --- a/include/asm-ia64/processor.h
> +++ b/include/asm-ia64/processor.h
> @@ -50,7 +50,8 @@
> #define IA64_THREAD_PM_VALID (__IA64_UL(1) << 2) /* performance registers valid? */
> #define IA64_THREAD_UAC_NOPRINT (__IA64_UL(1) << 3) /* don't log unaligned accesses */
> #define IA64_THREAD_UAC_SIGBUS (__IA64_UL(1) << 4) /* generate SIGBUS on unaligned acc. */
> - /* bit 5 is currently unused */
> +#define IA64_THREAD_MIGRATION (__IA64_UL(1) << 5) /* require migration
> + sync at ctx sw */
> #define IA64_THREAD_FPEMU_NOPRINT (__IA64_UL(1) << 6) /* don't log any fpswa faults */
> #define IA64_THREAD_FPEMU_SIGFPE (__IA64_UL(1) << 7) /* send a SIGFPE for fpswa faults */
>
> diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
> --- a/include/asm-ia64/system.h
> +++ b/include/asm-ia64/system.h
> @@ -244,6 +244,13 @@ extern void ia64_load_extra (struct task
> __ia64_save_fpu((prev)->thread.fph); \
> } \
> __switch_to(prev, next, last); \
> + /* "next" in old context is "current" in new context */ \
> + if (unlikely((current->thread.flags & IA64_THREAD_MIGRATION) && \
> + (task_cpu(current) != \
> + task_thread_info(current)->last_cpu))) { \
> + platform_migrate(current); \
> + task_thread_info(current)->last_cpu = task_cpu(current); \
> + } \
> } while (0)
> #else
> # define switch_to(prev,next,last) __switch_to(prev, next, last)
80-column religious sometime is just pure annoying. I personally find
this looked worse than having the lines spilled over 80 column, in these
two particular instances.
- Ken
^ permalink raw reply [flat|nested] 73+ messages in thread* Re: [PATCH] SN2 user-MMIO CPU migration
2006-01-20 0:06 [PATCH] SN2 user-MMIO CPU migration Brent Casavant
` (63 preceding siblings ...)
2006-01-27 0:07 ` Chen, Kenneth W
@ 2006-01-27 14:01 ` Prarit Bhargava
64 siblings, 0 replies; 73+ messages in thread
From: Prarit Bhargava @ 2006-01-27 14:01 UTC (permalink / raw)
To: linux-ia64
>
> 80-column religious sometime is just pure annoying. I personally find
> this looked worse than having the lines spilled over 80 column, in these
> two particular instances.
>
> - Ken
Ken,
I understand your concern, however, Documentation/CodingStyle is
explicit about this -- do not exceed 80 columns.
The problem here really is that the code already exceeds 80 columns and
we're (in a sense) breaking it even further by continuing to exceed 80
columns.
P.
^ permalink raw reply [flat|nested] 73+ messages in thread