* [PATCH] Allow preemption during lazy mmu updates
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner
Hi all,
We discussed this series a while ago. The specific problem
was the need to disable preemption in apply_to_pte_range when using
lazy mmu updates around the callback function. When used on usermode
addresses there was no problem, because that path needs to take the
pte lock anyway (which disables preemption), but there's no requirement
to take a pte lock when updating kernel ptes, so it ended up adding
a new no-preempt region.
The gist of the series is that if we get preempted while doing an mmu
update, we flush all the pending updates and switch to the next task.
We record that the task was doing a lazy mmu update in its task flags,
and resume lazy updates when we switch back.
All the context-switch time activity happens in the existing
context-switch pvops calls, so there's no cost to non-pvops systems,
or to pvops backends which don't use lazy mmu updates.
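In sketch form (illustrative only, not the literal patch code; it uses
the hook and flag names introduced by the later patches in this series):
	/* On switching away: flush the pending batch and remember
	 * that 'prev' was mid-update. */
	void start_context_switch_sketch(struct task_struct *prev)
	{
		if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) {
			arch_leave_lazy_mmu_mode();	/* flushes pending updates */
			set_ti_thread_flag(task_thread_info(prev),
					   TIF_LAZY_MMU_UPDATES);
		}
	}
	/* On switching back in: resume batching where 'next' left off. */
	void end_context_switch_sketch(struct task_struct *next)
	{
		if (test_and_clear_ti_thread_flag(task_thread_info(next),
						  TIF_LAZY_MMU_UPDATES))
			arch_enter_lazy_mmu_mode();
	}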
I don't think there were any objections to this series, but Ingo would
like to see an Acked-by from someone since it gets into the mm side
of things.
(The first patch in the series adds the required preempt disable/enable
and then the rest of the series removes them again. I think the first
patch is already in -mm.)
Thanks,
J
arch/x86/include/asm/paravirt.h | 22 +++++++-------
arch/x86/include/asm/pgtable.h | 2 +
arch/x86/include/asm/thread_info.h | 2 +
arch/x86/kernel/kvm.c | 2 -
arch/x86/kernel/paravirt.c | 56 +++++++++++++++++--------------------
arch/x86/kernel/process_32.c | 2 -
arch/x86/kernel/process_64.c | 2 -
arch/x86/kernel/vmi_32.c | 20 ++++++++-----
arch/x86/lguest/boot.c | 16 +++++++---
arch/x86/mm/fault.c | 6 +--
arch/x86/mm/highmem_32.c | 2 -
arch/x86/mm/iomap_32.c | 2 -
arch/x86/mm/pageattr.c | 14 ---------
arch/x86/xen/enlighten.c | 10 ++----
arch/x86/xen/mmu.c | 20 +++++--------
arch/x86/xen/xen-ops.h | 1
include/asm-frv/pgtable.h | 4 +-
include/asm-generic/pgtable.h | 21 +++++++------
kernel/sched.c | 2 -
19 files changed, 96 insertions(+), 110 deletions(-)
* [PATCH 1/8] mm: disable preemption in apply_to_pte_range
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: bugfix
Lazy mmu mode needs preemption disabled, so if we're applying
the function to init_mm (which doesn't require any pte locks),
explicitly disable preemption. (Do it unconditionally after checking
we've successfully done the allocation, to simplify the error handling.)
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
mm/memory.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index ef11ac6..27f8677 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1722,6 +1722,7 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
BUG_ON(pmd_huge(*pmd));
+ preempt_disable();
arch_enter_lazy_mmu_mode();
token = pmd_pgtable(*pmd);
@@ -1733,6 +1734,7 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
} while (pte++, addr += PAGE_SIZE, addr != end);
arch_leave_lazy_mmu_mode();
+ preempt_enable();
if (mm != &init_mm)
pte_unmap_unlock(pte-1, ptl);
--
1.6.0.6
* [PATCH 2/8] x86/paravirt: remove lazy mode in interrupts
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: simplification, robustness
Make paravirt_get_lazy_mode() always return PARAVIRT_LAZY_NONE
when in an interrupt. This prevents interrupt code from
accidentally inheriting an outer lazy state; interrupt code
instead does everything synchronously, while the outer batched
operations are left deferred.
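The effect on a batching backend can be sketched as follows (a hedged
illustration; queue_update() is a hypothetical batching helper, the
other symbols are real):
	static void example_set_pte(pte_t *ptep, pte_t pteval)
	{
		if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU)
			queue_update(ptep, pteval);	/* batched, flushed later */
		else
			native_set_pte(ptep, pteval);	/* immediate */
	}
With the in_interrupt() check below, interrupt-context callers always
take the synchronous branch rather than queueing into a batch they may
never flush.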
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/kernel/paravirt.c | 3 +++
arch/x86/mm/fault.c | 6 ++----
arch/x86/mm/highmem_32.c | 2 --
arch/x86/mm/iomap_32.c | 2 --
arch/x86/mm/pageattr.c | 14 --------------
5 files changed, 5 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 8e45f44..c866521 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -282,6 +282,9 @@ void paravirt_leave_lazy_cpu(void)
enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
{
+ if (in_interrupt())
+ return PARAVIRT_LAZY_NONE;
+
return __get_cpu_var(paravirt_lazy_mode);
}
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index f70b901..09e6ae4 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -227,12 +227,10 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
if (!pmd_present(*pmd_k))
return NULL;
- if (!pmd_present(*pmd)) {
+ if (!pmd_present(*pmd))
set_pmd(pmd, *pmd_k);
- arch_flush_lazy_mmu_mode();
- } else {
+ else
BUG_ON(pmd_page(*pmd) != pmd_page(*pmd_k));
- }
return pmd_k;
}
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index 522db5e..17d0103 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -87,7 +87,6 @@ void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot)
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
BUG_ON(!pte_none(*(kmap_pte-idx)));
set_pte(kmap_pte-idx, mk_pte(page, prot));
- arch_flush_lazy_mmu_mode();
return (void *)vaddr;
}
@@ -117,7 +116,6 @@ void kunmap_atomic(void *kvaddr, enum km_type type)
#endif
}
- arch_flush_lazy_mmu_mode();
pagefault_enable();
}
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 699c9b2..0c16a33 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -41,7 +41,6 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot)
idx = type + KM_TYPE_NR * smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
set_pte(kmap_pte - idx, pfn_pte(pfn, prot));
- arch_flush_lazy_mmu_mode();
return (void *)vaddr;
}
@@ -80,7 +79,6 @@ iounmap_atomic(void *kvaddr, enum km_type type)
if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx))
kpte_clear_flush(kmap_pte-idx, vaddr);
- arch_flush_lazy_mmu_mode();
pagefault_enable();
}
EXPORT_SYMBOL_GPL(iounmap_atomic);
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index b0e5adb..1224865 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -844,13 +844,6 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
vm_unmap_aliases();
- /*
- * If we're called with lazy mmu updates enabled, the
- * in-memory pte state may be stale. Flush pending updates to
- * bring them up to date.
- */
- arch_flush_lazy_mmu_mode();
-
cpa.vaddr = addr;
cpa.pages = pages;
cpa.numpages = numpages;
@@ -895,13 +888,6 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
} else
cpa_flush_all(cache);
- /*
- * If we've been called with lazy mmu updates enabled, then
- * make sure that everything gets flushed out before we
- * return.
- */
- arch_flush_lazy_mmu_mode();
-
out:
return ret;
}
--
1.6.0.6
* [PATCH 3/8] x86/pvops: replace arch_enter_lazy_cpu_mode with arch_start_context_switch
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: simplification, prepare for later changes
Make lazy cpu mode more specific to context switching, so that
it makes sense to do more context-switch-specific work in
the callbacks.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/include/asm/paravirt.h | 8 +++-----
arch/x86/kernel/paravirt.c | 13 -------------
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
arch/x86/xen/mmu.c | 5 +----
include/asm-frv/pgtable.h | 4 ++--
include/asm-generic/pgtable.h | 21 +++++++++++----------
kernel/sched.c | 2 +-
8 files changed, 20 insertions(+), 37 deletions(-)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 7727aa8..f79dc89 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -1405,19 +1405,17 @@ void paravirt_enter_lazy_mmu(void);
void paravirt_leave_lazy_mmu(void);
void paravirt_leave_lazy(enum paravirt_lazy_mode mode);
-#define __HAVE_ARCH_ENTER_LAZY_CPU_MODE
-static inline void arch_enter_lazy_cpu_mode(void)
+#define __HAVE_ARCH_START_CONTEXT_SWITCH
+static inline void arch_start_context_switch(void)
{
PVOP_VCALL0(pv_cpu_ops.lazy_mode.enter);
}
-static inline void arch_leave_lazy_cpu_mode(void)
+static inline void arch_end_context_switch(void)
{
PVOP_VCALL0(pv_cpu_ops.lazy_mode.leave);
}
-void arch_flush_lazy_cpu_mode(void);
-
#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
static inline void arch_enter_lazy_mmu_mode(void)
{
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c866521..a53e4fb 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -301,19 +301,6 @@ void arch_flush_lazy_mmu_mode(void)
preempt_enable();
}
-void arch_flush_lazy_cpu_mode(void)
-{
- preempt_disable();
-
- if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_CPU) {
- WARN_ON(preempt_count() == 1);
- arch_leave_lazy_cpu_mode();
- arch_enter_lazy_cpu_mode();
- }
-
- preempt_enable();
-}
-
struct pv_info pv_info = {
.name = "bare hardware",
.paravirt_enabled = 0,
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 14014d7..57e49a8 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -407,7 +407,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
* done before math_state_restore, so the TS bit is up
* to date.
*/
- arch_leave_lazy_cpu_mode();
+ arch_end_context_switch();
/* If the task has used fpu the last 5 timeslices, just do a full
* restore of the math state immediately to avoid the trap; the
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index abb7e6a..7115e60 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -428,7 +428,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
* done before math_state_restore, so the TS bit is up
* to date.
*/
- arch_leave_lazy_cpu_mode();
+ arch_end_context_switch();
/*
* Switch FS and GS.
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index db3802f..4702c79 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1119,10 +1119,8 @@ static void drop_other_mm_ref(void *info)
/* If this cpu still has a stale cr3 reference, then make sure
it has been flushed. */
- if (percpu_read(xen_current_cr3) == __pa(mm->pgd)) {
+ if (percpu_read(xen_current_cr3) == __pa(mm->pgd))
load_cr3(swapper_pg_dir);
- arch_flush_lazy_cpu_mode();
- }
}
static void xen_drop_mm_ref(struct mm_struct *mm)
@@ -1135,7 +1133,6 @@ static void xen_drop_mm_ref(struct mm_struct *mm)
load_cr3(swapper_pg_dir);
else
leave_mm(smp_processor_id());
- arch_flush_lazy_cpu_mode();
}
/* Get the "official" set of cpus referring to our pagetable. */
diff --git a/include/asm-frv/pgtable.h b/include/asm-frv/pgtable.h
index e16fdb1..235e34a 100644
--- a/include/asm-frv/pgtable.h
+++ b/include/asm-frv/pgtable.h
@@ -73,8 +73,8 @@ static inline int pte_file(pte_t pte) { return 0; }
#define pgtable_cache_init() do {} while (0)
#define arch_enter_lazy_mmu_mode() do {} while (0)
#define arch_leave_lazy_mmu_mode() do {} while (0)
-#define arch_enter_lazy_cpu_mode() do {} while (0)
-#define arch_leave_lazy_cpu_mode() do {} while (0)
+
+#define arch_start_context_switch() do {} while (0)
#else /* !CONFIG_MMU */
/*****************************************************************************/
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 8e6d0ca..922f036 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -280,17 +280,18 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
#endif
/*
- * A facility to provide batching of the reload of page tables with the
- * actual context switch code for paravirtualized guests. By convention,
- * only one of the lazy modes (CPU, MMU) should be active at any given
- * time, entry should never be nested, and entry and exits should always
- * be paired. This is for sanity of maintaining and reasoning about the
- * kernel code.
+ * A facility to provide batching of the reload of page tables and
+ * other process state with the actual context switch code for
+ * paravirtualized guests. By convention, only one of the batched
+ * update (lazy) modes (CPU, MMU) should be active at any given time,
+ * entry should never be nested, and entry and exits should always be
+ * paired. This is for sanity of maintaining and reasoning about the
+ * kernel code. In this case, the exit (end of the context switch) is
+ * in architecture-specific code, and so doesn't need a generic
+ * definition.
*/
-#ifndef __HAVE_ARCH_ENTER_LAZY_CPU_MODE
-#define arch_enter_lazy_cpu_mode() do {} while (0)
-#define arch_leave_lazy_cpu_mode() do {} while (0)
-#define arch_flush_lazy_cpu_mode() do {} while (0)
+#ifndef __HAVE_ARCH_START_CONTEXT_SWITCH
+#define arch_start_context_switch() do {} while (0)
#endif
#ifndef __HAVE_PFNMAP_TRACKING
diff --git a/kernel/sched.c b/kernel/sched.c
index 3e827b8..168adaf 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2804,7 +2804,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
* combine the page table reload and the switch backend into
* one hypercall.
*/
- arch_enter_lazy_cpu_mode();
+ arch_start_context_switch();
if (unlikely(!mm)) {
next->active_mm = oldmm;
--
1.6.0.6
* [PATCH 4/8] x86/paravirt: flush pending mmu updates on context switch
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: allow preemption during lazy mmu updates
If we're in lazy mmu mode when context switching, leave
lazy mmu mode, but remember the task's state in
TIF_LAZY_MMU_UPDATES. When we resume the task, check this
flag and re-enter lazy mmu mode if it's set.
This sets things up for allowing lazy mmu mode while preemptible,
though that won't actually be active until the next change.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/include/asm/paravirt.h | 1 -
arch/x86/include/asm/thread_info.h | 2 ++
arch/x86/kernel/kvm.c | 2 +-
arch/x86/kernel/paravirt.c | 13 ++++++++++---
arch/x86/kernel/vmi_32.c | 14 ++++++++++----
arch/x86/lguest/boot.c | 14 ++++++++++----
arch/x86/xen/enlighten.c | 6 +++---
arch/x86/xen/mmu.c | 7 ++++++-
arch/x86/xen/xen-ops.h | 1 -
9 files changed, 42 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index f79dc89..5ecd16e 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -1403,7 +1403,6 @@ void paravirt_enter_lazy_cpu(void);
void paravirt_leave_lazy_cpu(void);
void paravirt_enter_lazy_mmu(void);
void paravirt_leave_lazy_mmu(void);
-void paravirt_leave_lazy(enum paravirt_lazy_mode mode);
#define __HAVE_ARCH_START_CONTEXT_SWITCH
static inline void arch_start_context_switch(void)
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 83d2b73..e6b4beb 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -96,6 +96,7 @@ struct thread_info {
#define TIF_DEBUGCTLMSR 25 /* uses thread_struct.debugctlmsr */
#define TIF_DS_AREA_MSR 26 /* uses thread_struct.ds_area_msr */
#define TIF_SYSCALL_FTRACE 27 /* for ftrace syscall instrumentation */
+#define TIF_LAZY_MMU_UPDATES 28 /* task is updating the mmu lazily */
#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
@@ -119,6 +120,7 @@ struct thread_info {
#define _TIF_DEBUGCTLMSR (1 << TIF_DEBUGCTLMSR)
#define _TIF_DS_AREA_MSR (1 << TIF_DS_AREA_MSR)
#define _TIF_SYSCALL_FTRACE (1 << TIF_SYSCALL_FTRACE)
+#define _TIF_LAZY_MMU_UPDATES (1 << TIF_LAZY_MMU_UPDATES)
/* work to do in syscall_trace_enter() */
#define _TIF_WORK_SYSCALL_ENTRY \
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 33019dd..6551ded 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -195,7 +195,7 @@ static void kvm_leave_lazy_mmu(void)
struct kvm_para_state *state = kvm_para_state();
mmu_queue_flush(state);
- paravirt_leave_lazy(paravirt_get_lazy_mode());
+ paravirt_leave_lazy_mmu();
state->mode = paravirt_get_lazy_mode();
}
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index a53e4fb..eca40de 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -252,7 +252,7 @@ static inline void enter_lazy(enum paravirt_lazy_mode mode)
__get_cpu_var(paravirt_lazy_mode) = mode;
}
-void paravirt_leave_lazy(enum paravirt_lazy_mode mode)
+static void leave_lazy(enum paravirt_lazy_mode mode)
{
BUG_ON(__get_cpu_var(paravirt_lazy_mode) != mode);
BUG_ON(preemptible());
@@ -267,17 +267,24 @@ void paravirt_enter_lazy_mmu(void)
void paravirt_leave_lazy_mmu(void)
{
- paravirt_leave_lazy(PARAVIRT_LAZY_MMU);
+ leave_lazy(PARAVIRT_LAZY_MMU);
}
void paravirt_enter_lazy_cpu(void)
{
+ if (percpu_read(paravirt_lazy_mode) == PARAVIRT_LAZY_MMU) {
+ arch_leave_lazy_mmu_mode();
+ set_thread_flag(TIF_LAZY_MMU_UPDATES);
+ }
enter_lazy(PARAVIRT_LAZY_CPU);
}
void paravirt_leave_lazy_cpu(void)
{
- paravirt_leave_lazy(PARAVIRT_LAZY_CPU);
+ leave_lazy(PARAVIRT_LAZY_CPU);
+
+ if (test_and_clear_thread_flag(TIF_LAZY_MMU_UPDATES))
+ arch_enter_lazy_mmu_mode();
}
enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
diff --git a/arch/x86/kernel/vmi_32.c b/arch/x86/kernel/vmi_32.c
index 95deb9f..d74122f 100644
--- a/arch/x86/kernel/vmi_32.c
+++ b/arch/x86/kernel/vmi_32.c
@@ -468,16 +468,22 @@ static void vmi_enter_lazy_cpu(void)
vmi_ops.set_lazy_mode(2);
}
+static void vmi_leave_lazy_cpu(void)
+{
+ vmi_ops.set_lazy_mode(0);
+ paravirt_leave_lazy_cpu();
+}
+
static void vmi_enter_lazy_mmu(void)
{
paravirt_enter_lazy_mmu();
vmi_ops.set_lazy_mode(1);
}
-static void vmi_leave_lazy(void)
+static void vmi_leave_lazy_mmu(void)
{
- paravirt_leave_lazy(paravirt_get_lazy_mode());
vmi_ops.set_lazy_mode(0);
+ paravirt_leave_lazy_mmu();
}
static inline int __init check_vmi_rom(struct vrom_header *rom)
@@ -713,12 +719,12 @@ static inline int __init activate_vmi(void)
para_wrap(pv_cpu_ops.lazy_mode.enter, vmi_enter_lazy_cpu,
set_lazy_mode, SetLazyMode);
- para_wrap(pv_cpu_ops.lazy_mode.leave, vmi_leave_lazy,
+ para_wrap(pv_cpu_ops.lazy_mode.leave, vmi_leave_lazy_cpu,
set_lazy_mode, SetLazyMode);
para_wrap(pv_mmu_ops.lazy_mode.enter, vmi_enter_lazy_mmu,
set_lazy_mode, SetLazyMode);
- para_wrap(pv_mmu_ops.lazy_mode.leave, vmi_leave_lazy,
+ para_wrap(pv_mmu_ops.lazy_mode.leave, vmi_leave_lazy_mmu,
set_lazy_mode, SetLazyMode);
/* user and kernel flush are just handled with different flags to FlushTLB */
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index 90e44a1..70d412a 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -147,10 +147,16 @@ static void lazy_hcall(unsigned long call,
/* When lazy mode is turned off reset the per-cpu lazy mode variable and then
* issue the do-nothing hypercall to flush any stored calls. */
-static void lguest_leave_lazy_mode(void)
+static void lguest_leave_lazy_mmu_mode(void)
{
- paravirt_leave_lazy(paravirt_get_lazy_mode());
hcall(LHCALL_FLUSH_ASYNC, 0, 0, 0);
+ paravirt_leave_lazy_mmu();
+}
+
+static void lguest_leave_lazy_cpu_mode(void)
+{
+ hcall(LHCALL_FLUSH_ASYNC, 0, 0, 0);
+ paravirt_leave_lazy_cpu();
}
/*G:033
@@ -1026,7 +1032,7 @@ __init void lguest_init(void)
pv_cpu_ops.write_idt_entry = lguest_write_idt_entry;
pv_cpu_ops.wbinvd = lguest_wbinvd;
pv_cpu_ops.lazy_mode.enter = paravirt_enter_lazy_cpu;
- pv_cpu_ops.lazy_mode.leave = lguest_leave_lazy_mode;
+ pv_cpu_ops.lazy_mode.leave = lguest_leave_lazy_cpu_mode;
/* pagetable management */
pv_mmu_ops.write_cr3 = lguest_write_cr3;
@@ -1039,7 +1045,7 @@ __init void lguest_init(void)
pv_mmu_ops.read_cr2 = lguest_read_cr2;
pv_mmu_ops.read_cr3 = lguest_read_cr3;
pv_mmu_ops.lazy_mode.enter = paravirt_enter_lazy_mmu;
- pv_mmu_ops.lazy_mode.leave = lguest_leave_lazy_mode;
+ pv_mmu_ops.lazy_mode.leave = lguest_leave_lazy_mmu_mode;
#ifdef CONFIG_X86_LOCAL_APIC
/* apic read/write intercepts */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 82cd39a..f586e63 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -203,10 +203,10 @@ static unsigned long xen_get_debugreg(int reg)
return HYPERVISOR_get_debugreg(reg);
}
-void xen_leave_lazy(void)
+static void xen_leave_lazy_cpu(void)
{
- paravirt_leave_lazy(paravirt_get_lazy_mode());
xen_mc_flush();
+ paravirt_leave_lazy_cpu();
}
static unsigned long xen_store_tr(void)
@@ -819,7 +819,7 @@ static const struct pv_cpu_ops xen_cpu_ops __initdata = {
.lazy_mode = {
.enter = paravirt_enter_lazy_cpu,
- .leave = xen_leave_lazy,
+ .leave = xen_leave_lazy_cpu,
},
};
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 4702c79..aba3b20 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1816,6 +1816,11 @@ __init void xen_post_allocator_init(void)
xen_mark_init_mm_pinned();
}
+static void xen_leave_lazy_mmu(void)
+{
+ xen_mc_flush();
+ paravirt_leave_lazy_mmu();
+}
const struct pv_mmu_ops xen_mmu_ops __initdata = {
.pagetable_setup_start = xen_pagetable_setup_start,
@@ -1890,7 +1895,7 @@ const struct pv_mmu_ops xen_mmu_ops __initdata = {
.lazy_mode = {
.enter = paravirt_enter_lazy_mmu,
- .leave = xen_leave_lazy,
+ .leave = xen_leave_lazy_mmu,
},
.set_fixmap = xen_set_fixmap,
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 2f5ef26..f897cdf 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -30,7 +30,6 @@ pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn);
void xen_ident_map_ISA(void);
void xen_reserve_top(void);
-void xen_leave_lazy(void);
void xen_post_allocator_init(void);
char * __init xen_memory_setup(void);
--
1.6.0.6
* [PATCH 5/8] x86/paravirt: finish change from lazy cpu to context switch start/end
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: fix lazy context switch API
Pass the previous and next tasks into the context-switch start/end
calls, so that the called functions can properly access the
task state (especially in end_context_switch, where the next task
is not yet completely current).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/include/asm/paravirt.h | 17 ++++++++++-------
arch/x86/include/asm/pgtable.h | 2 ++
arch/x86/kernel/paravirt.c | 14 ++++++--------
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
arch/x86/kernel/vmi_32.c | 12 ++++++------
arch/x86/lguest/boot.c | 8 ++++----
arch/x86/xen/enlighten.c | 10 ++++------
include/asm-frv/pgtable.h | 2 +-
include/asm-generic/pgtable.h | 2 +-
kernel/sched.c | 2 +-
11 files changed, 37 insertions(+), 36 deletions(-)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 5ecd16e..bc384be 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -56,6 +56,7 @@ struct desc_ptr;
struct tss_struct;
struct mm_struct;
struct desc_struct;
+struct task_struct;
/*
* Wrapper type for pointers to code which uses the non-standard
@@ -203,7 +204,8 @@ struct pv_cpu_ops {
void (*swapgs)(void);
- struct pv_lazy_ops lazy_mode;
+ void (*start_context_switch)(struct task_struct *prev);
+ void (*end_context_switch)(struct task_struct *next);
};
struct pv_irq_ops {
@@ -1399,20 +1401,21 @@ enum paravirt_lazy_mode {
};
enum paravirt_lazy_mode paravirt_get_lazy_mode(void);
-void paravirt_enter_lazy_cpu(void);
-void paravirt_leave_lazy_cpu(void);
+void paravirt_start_context_switch(struct task_struct *prev);
+void paravirt_end_context_switch(struct task_struct *next);
+
void paravirt_enter_lazy_mmu(void);
void paravirt_leave_lazy_mmu(void);
#define __HAVE_ARCH_START_CONTEXT_SWITCH
-static inline void arch_start_context_switch(void)
+static inline void arch_start_context_switch(struct task_struct *prev)
{
- PVOP_VCALL0(pv_cpu_ops.lazy_mode.enter);
+ PVOP_VCALL1(pv_cpu_ops.start_context_switch, prev);
}
-static inline void arch_end_context_switch(void)
+static inline void arch_end_context_switch(struct task_struct *next)
{
- PVOP_VCALL0(pv_cpu_ops.lazy_mode.leave);
+ PVOP_VCALL1(pv_cpu_ops.end_context_switch, next);
}
#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f5ba6c1..205f6a9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -81,6 +81,8 @@ static inline void __init paravirt_pagetable_setup_done(pgd_t *base)
#define pte_val(x) native_pte_val(x)
#define __pte(x) native_make_pte(x)
+#define arch_end_context_switch(prev) do {} while(0)
+
#endif /* CONFIG_PARAVIRT */
/*
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index eca40de..35eb353 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -270,20 +270,20 @@ void paravirt_leave_lazy_mmu(void)
leave_lazy(PARAVIRT_LAZY_MMU);
}
-void paravirt_enter_lazy_cpu(void)
+void paravirt_start_context_switch(struct task_struct *prev)
{
if (percpu_read(paravirt_lazy_mode) == PARAVIRT_LAZY_MMU) {
arch_leave_lazy_mmu_mode();
- set_thread_flag(TIF_LAZY_MMU_UPDATES);
+ set_ti_thread_flag(task_thread_info(prev), TIF_LAZY_MMU_UPDATES);
}
enter_lazy(PARAVIRT_LAZY_CPU);
}
-void paravirt_leave_lazy_cpu(void)
+void paravirt_end_context_switch(struct task_struct *next)
{
leave_lazy(PARAVIRT_LAZY_CPU);
- if (test_and_clear_thread_flag(TIF_LAZY_MMU_UPDATES))
+ if (test_and_clear_ti_thread_flag(task_thread_info(next), TIF_LAZY_MMU_UPDATES))
arch_enter_lazy_mmu_mode();
}
@@ -399,10 +399,8 @@ struct pv_cpu_ops pv_cpu_ops = {
.set_iopl_mask = native_set_iopl_mask,
.io_delay = native_io_delay,
- .lazy_mode = {
- .enter = paravirt_nop,
- .leave = paravirt_nop,
- },
+ .start_context_switch = paravirt_nop,
+ .end_context_switch = paravirt_nop,
};
struct pv_apic_ops pv_apic_ops = {
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 57e49a8..d766c76 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -407,7 +407,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
* done before math_state_restore, so the TS bit is up
* to date.
*/
- arch_end_context_switch();
+ arch_end_context_switch(next_p);
/* If the task has used fpu the last 5 timeslices, just do a full
* restore of the math state immediately to avoid the trap; the
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 7115e60..e8a9aaf 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -428,7 +428,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
* done before math_state_restore, so the TS bit is up
* to date.
*/
- arch_end_context_switch();
+ arch_end_context_switch(next_p);
/*
* Switch FS and GS.
diff --git a/arch/x86/kernel/vmi_32.c b/arch/x86/kernel/vmi_32.c
index d74122f..b263423 100644
--- a/arch/x86/kernel/vmi_32.c
+++ b/arch/x86/kernel/vmi_32.c
@@ -462,16 +462,16 @@ vmi_startup_ipi_hook(int phys_apicid, unsigned long start_eip,
}
#endif
-static void vmi_enter_lazy_cpu(void)
+static void vmi_start_context_switch(struct task_struct *prev)
{
- paravirt_enter_lazy_cpu();
+ paravirt_start_context_switch(prev);
vmi_ops.set_lazy_mode(2);
}
-static void vmi_leave_lazy_cpu(void)
+static void vmi_end_context_switch(struct task_struct *next)
{
vmi_ops.set_lazy_mode(0);
- paravirt_leave_lazy_cpu();
+ paravirt_end_context_switch(next);
}
static void vmi_enter_lazy_mmu(void)
@@ -717,9 +717,9 @@ static inline int __init activate_vmi(void)
para_fill(pv_cpu_ops.set_iopl_mask, SetIOPLMask);
para_fill(pv_cpu_ops.io_delay, IODelay);
- para_wrap(pv_cpu_ops.lazy_mode.enter, vmi_enter_lazy_cpu,
+ para_wrap(pv_cpu_ops.start_context_switch, vmi_start_context_switch,
set_lazy_mode, SetLazyMode);
- para_wrap(pv_cpu_ops.lazy_mode.leave, vmi_leave_lazy_cpu,
+ para_wrap(pv_cpu_ops.end_context_switch, vmi_end_context_switch,
set_lazy_mode, SetLazyMode);
para_wrap(pv_mmu_ops.lazy_mode.enter, vmi_enter_lazy_mmu,
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index 70d412a..25799f3 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -153,10 +153,10 @@ static void lguest_leave_lazy_mmu_mode(void)
paravirt_leave_lazy_mmu();
}
-static void lguest_leave_lazy_cpu_mode(void)
+static void lguest_end_context_switch(struct task_struct *next)
{
hcall(LHCALL_FLUSH_ASYNC, 0, 0, 0);
- paravirt_leave_lazy_cpu();
+ paravirt_end_context_switch(next);
}
/*G:033
@@ -1031,8 +1031,8 @@ __init void lguest_init(void)
pv_cpu_ops.write_gdt_entry = lguest_write_gdt_entry;
pv_cpu_ops.write_idt_entry = lguest_write_idt_entry;
pv_cpu_ops.wbinvd = lguest_wbinvd;
- pv_cpu_ops.lazy_mode.enter = paravirt_enter_lazy_cpu;
- pv_cpu_ops.lazy_mode.leave = lguest_leave_lazy_cpu_mode;
+ pv_cpu_ops.start_context_switch = paravirt_start_context_switch;
+ pv_cpu_ops.end_context_switch = lguest_end_context_switch;
/* pagetable management */
pv_mmu_ops.write_cr3 = lguest_write_cr3;
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index f586e63..70b355d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -203,10 +203,10 @@ static unsigned long xen_get_debugreg(int reg)
return HYPERVISOR_get_debugreg(reg);
}
-static void xen_leave_lazy_cpu(void)
+static void xen_end_context_switch(struct task_struct *next)
{
xen_mc_flush();
- paravirt_leave_lazy_cpu();
+ paravirt_end_context_switch(next);
}
static unsigned long xen_store_tr(void)
@@ -817,10 +817,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initdata = {
/* Xen takes care of %gs when switching to usermode for us */
.swapgs = paravirt_nop,
- .lazy_mode = {
- .enter = paravirt_enter_lazy_cpu,
- .leave = xen_leave_lazy_cpu,
- },
+ .start_context_switch = paravirt_start_context_switch,
+ .end_context_switch = xen_end_context_switch,
};
static const struct pv_apic_ops xen_apic_ops __initdata = {
diff --git a/include/asm-frv/pgtable.h b/include/asm-frv/pgtable.h
index 235e34a..0988704 100644
--- a/include/asm-frv/pgtable.h
+++ b/include/asm-frv/pgtable.h
@@ -74,7 +74,7 @@ static inline int pte_file(pte_t pte) { return 0; }
#define arch_enter_lazy_mmu_mode() do {} while (0)
#define arch_leave_lazy_mmu_mode() do {} while (0)
-#define arch_start_context_switch() do {} while (0)
+#define arch_start_context_switch(prev) do {} while (0)
#else /* !CONFIG_MMU */
/*****************************************************************************/
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 922f036..e410f60 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -291,7 +291,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
* definition.
*/
#ifndef __HAVE_ARCH_START_CONTEXT_SWITCH
-#define arch_start_context_switch() do {} while (0)
+#define arch_start_context_switch(prev) do {} while (0)
#endif
#ifndef __HAVE_PFNMAP_TRACKING
diff --git a/kernel/sched.c b/kernel/sched.c
index 168adaf..77b43e3 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2804,7 +2804,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
* combine the page table reload and the switch backend into
* one hypercall.
*/
- arch_start_context_switch();
+ arch_start_context_switch(prev);
if (unlikely(!mm)) {
next->active_mm = oldmm;
--
1.6.0.6
* [PATCH 6/8] x86/paravirt: allow preemption with lazy mmu mode
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: remove obsolete checks, simplification
Lift restrictions on preemption with lazy mmu mode, as it is now allowed.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/kernel/paravirt.c | 7 ++++---
arch/x86/xen/mmu.c | 8 +-------
2 files changed, 5 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 35eb353..c59a9d3 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -247,7 +247,6 @@ static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LA
static inline void enter_lazy(enum paravirt_lazy_mode mode)
{
BUG_ON(__get_cpu_var(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
- BUG_ON(preemptible());
__get_cpu_var(paravirt_lazy_mode) = mode;
}
@@ -255,7 +254,6 @@ static inline void enter_lazy(enum paravirt_lazy_mode mode)
static void leave_lazy(enum paravirt_lazy_mode mode)
{
BUG_ON(__get_cpu_var(paravirt_lazy_mode) != mode);
- BUG_ON(preemptible());
__get_cpu_var(paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;
}
@@ -272,6 +270,8 @@ void paravirt_leave_lazy_mmu(void)
void paravirt_start_context_switch(struct task_struct *prev)
{
+ BUG_ON(preemptible());
+
if (percpu_read(paravirt_lazy_mode) == PARAVIRT_LAZY_MMU) {
arch_leave_lazy_mmu_mode();
set_ti_thread_flag(task_thread_info(prev), TIF_LAZY_MMU_UPDATES);
@@ -281,6 +281,8 @@ void paravirt_start_context_switch(struct task_struct *prev)
void paravirt_end_context_switch(struct task_struct *next)
{
+ BUG_ON(preemptible());
+
leave_lazy(PARAVIRT_LAZY_CPU);
if (test_and_clear_ti_thread_flag(task_thread_info(next), TIF_LAZY_MMU_UPDATES))
@@ -300,7 +302,6 @@ void arch_flush_lazy_mmu_mode(void)
preempt_disable();
if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) {
- WARN_ON(preempt_count() == 1);
arch_leave_lazy_mmu_mode();
arch_enter_lazy_mmu_mode();
}
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index aba3b20..e194f72 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -419,10 +419,6 @@ void set_pte_mfn(unsigned long vaddr, unsigned long mfn, pgprot_t flags)
void xen_set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pteval)
{
- /* updates to init_mm may be done without lock */
- if (mm == &init_mm)
- preempt_disable();
-
ADD_STATS(set_pte_at, 1);
// ADD_STATS(set_pte_at_pinned, xen_page_pinned(ptep));
ADD_STATS(set_pte_at_current, mm == current->mm);
@@ -443,9 +439,7 @@ void xen_set_pte_at(struct mm_struct *mm, unsigned long addr,
}
xen_set_pte(ptep, pteval);
-out:
- if (mm == &init_mm)
- preempt_enable();
+out: return;
}
pte_t xen_ptep_modify_prot_start(struct mm_struct *mm,
--
1.6.0.6
* [PATCH 7/8] mm: allow preemption in apply_to_pte_range
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: allow preemption in apply_to_pte_range updates to init_mm
Preemption is now allowed for lazy mmu mode, so don't disable
it for the inner loop of apply_to_pte_range. This only applies
when doing updates to init_mm; user pagetables are still modified
under the pte lock, so preemption is disabled anyway.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
mm/memory.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 27f8677..ef11ac6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1722,7 +1722,6 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
BUG_ON(pmd_huge(*pmd));
- preempt_disable();
arch_enter_lazy_mmu_mode();
token = pmd_pgtable(*pmd);
@@ -1734,7 +1733,6 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
} while (pte++, addr += PAGE_SIZE, addr != end);
arch_leave_lazy_mmu_mode();
- preempt_enable();
if (mm != &init_mm)
pte_unmap_unlock(pte-1, ptl);
--
1.6.0.6
* [PATCH 8/8] x86/paravirt: use percpu_ rather than __get_cpu_var
From: Jeremy Fitzhardinge @ 2009-03-27 18:02 UTC
To: Andrew Morton
Cc: the arch/x86 maintainers, Ingo Molnar, Linux Kernel Mailing List,
Peter Zijlstra, Nick Piggin, Thomas Gleixner, Jeremy Fitzhardinge
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: minor optimisation
percpu_read/write is a slightly more direct way of getting
to percpu data.
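A rough sketch of the difference (hedged; exact code generation
depends on configuration):
	/* __get_cpu_var() computes this CPU's address for the variable
	 * and then dereferences it; percpu_read() compiles to a single
	 * %fs/%gs segment-relative load on x86. */
	static enum paravirt_lazy_mode get_mode_old(void)
	{
		return __get_cpu_var(paravirt_lazy_mode);
	}
	static enum paravirt_lazy_mode get_mode_new(void)
	{
		return percpu_read(paravirt_lazy_mode);
	}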
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/kernel/paravirt.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c59a9d3..aa34423 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -246,16 +246,16 @@ static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LA
static inline void enter_lazy(enum paravirt_lazy_mode mode)
{
- BUG_ON(__get_cpu_var(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
+ BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
- __get_cpu_var(paravirt_lazy_mode) = mode;
+ percpu_write(paravirt_lazy_mode, mode);
}
static void leave_lazy(enum paravirt_lazy_mode mode)
{
- BUG_ON(__get_cpu_var(paravirt_lazy_mode) != mode);
+ BUG_ON(percpu_read(paravirt_lazy_mode) != mode);
- __get_cpu_var(paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;
+ percpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE);
}
void paravirt_enter_lazy_mmu(void)
@@ -294,7 +294,7 @@ enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
if (in_interrupt())
return PARAVIRT_LAZY_NONE;
- return __get_cpu_var(paravirt_lazy_mode);
+ return percpu_read(paravirt_lazy_mode);
}
void arch_flush_lazy_mmu_mode(void)
--
1.6.0.6
* Re: [PATCH] Allow preemption during lazy mmu updates
From: Peter Zijlstra @ 2009-03-27 23:48 UTC
To: Jeremy Fitzhardinge
Cc: Andrew Morton, the arch/x86 maintainers, Ingo Molnar,
Linux Kernel Mailing List, Nick Piggin, Thomas Gleixner
On Fri, 2009-03-27 at 11:02 -0700, Jeremy Fitzhardinge wrote:
> Hi all,
>
> We discussed this series a while ago. The specific problem
> was the need to disable preemption in apply_to_pte_range when using
> lazy mmu updates around the callback function. When used on usermode
> addresses there was no problem, because that path needs to take the
> pte lock anyway (which disables preemption), but there's no requirement
> to take a pte lock when updating kernel ptes, so it ended up adding
> a new no-preempt region.
>
> The gist of the series is that if we get preempted while doing an mmu
> update, we flush all the pending updates and switch to the next task.
> We record that the task was doing a lazy mmu update in its task flags,
> and resume lazy updates when we switch back.
>
> All the context-switch time activity happens in the existing
> context-switch pvops calls, so there's no cost to non-pvops systems,
> or to pvops backends which don't use lazy mmu updates.
>
> I don't think there were any objections to this series, but Ingo would
> like to see an Acked-by from someone since it gets into the mm side
> of things.
>
> (The first patch in the series adds the required preempt disable/enable
> and then the rest of the series removes them again. I think the first
> patch is already in -mm.)
Looks good from my POV
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* Re: [PATCH 7/8] mm: allow preemption in apply_to_pte_range
From: Andrew Morton @ 2009-04-07 21:38 UTC
To: Jeremy Fitzhardinge
Cc: x86, mingo, linux-kernel, a.p.zijlstra, nickpiggin, tglx,
jeremy.fitzhardinge
On Fri, 27 Mar 2009 11:02:42 -0700
Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>
> Impact: allow preemption in apply_to_pte_range updates to init_mm
>
> Preemption is now allowed for lazy mmu mode, so don't disable
> it for the inner loop of apply_to_pte_range. This only applies
> when doing updates to init_mm; user pagetables are still modified
> under the pte lock, so preemption is disabled anyway.
>
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> ---
> mm/memory.c | 2 --
> 1 files changed, 0 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 27f8677..ef11ac6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1722,7 +1722,6 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
>
> BUG_ON(pmd_huge(*pmd));
>
> - preempt_disable();
> arch_enter_lazy_mmu_mode();
>
> token = pmd_pgtable(*pmd);
> @@ -1734,7 +1733,6 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
> } while (pte++, addr += PAGE_SIZE, addr != end);
>
> arch_leave_lazy_mmu_mode();
> - preempt_enable();
>
> if (mm != &init_mm)
> pte_unmap_unlock(pte-1, ptl);
So across the patch series the aggregate change to mm/ is nil, and this
is wholly an x86 patch series?
* Re: [PATCH 7/8] mm: allow preemption in apply_to_pte_range
From: Jeremy Fitzhardinge @ 2009-04-07 21:54 UTC
To: Andrew Morton
Cc: x86, mingo, linux-kernel, a.p.zijlstra, nickpiggin, tglx,
jeremy.fitzhardinge
Andrew Morton wrote:
> So across the aptch series the aggregate change to mm/ is nil, and this
> is wholly an x86 patch series?
>
>
Functionally, yes. I also renamed arch_enter_lazy_cpu_mode() to
arch_start_context_switch(), which affects
kernel/sched.c:context_switch(), but it is just a name update. (It
also required an asm-frv update, since frv defines its own no-op
version of the macro rather than using the default no-op one, for
some reason.)
J
* Re: [PATCH] Allow preemption during lazy mmu updates
From: Ingo Molnar @ 2009-04-08 14:54 UTC
To: Jeremy Fitzhardinge
Cc: Andrew Morton, the arch/x86 maintainers,
Linux Kernel Mailing List, Peter Zijlstra, Nick Piggin,
Thomas Gleixner
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> include/asm-frv/pgtable.h | 4 +-
Needs the ack of the FRV arch maintainer both for content and for
flow (i.e. via x86 tree). If any second thoughts are expressed about
the flow then this needs to go on separate tracks.
> include/asm-generic/pgtable.h | 21 +++++++------
Needs the ack of arch maintainers in general and a linux-arch
cross-post.
> kernel/sched.c | 2 -
Needs the ack of ... oh, never mind - this one is fine i guess ;-)
Ingo
* Re: [PATCH] Allow preemption during lazy mmu updates
From: Peter Zijlstra @ 2009-04-08 15:11 UTC
To: Ingo Molnar
Cc: Jeremy Fitzhardinge, Andrew Morton, the arch/x86 maintainers,
Linux Kernel Mailing List, Nick Piggin, Thomas Gleixner,
Avi Kivity
On Wed, 2009-04-08 at 16:54 +0200, Ingo Molnar wrote:
>
> > kernel/sched.c | 2 -
>
> Needs the ack of ... oh, never mind - this one is fine i guess ;-)
Ah, about that. This new preemption hook has slightly different
requirements than the current preempt-notifiers have (hence the new
hook). I was wondering if KVM (afaik currently the only
preempt-notifier consumer) could live with these requirements.
That is, could these be merged?
* Re: [PATCH] Allow preemption during lazy mmu updates
From: Jeremy Fitzhardinge @ 2009-04-08 17:32 UTC
To: Ingo Molnar
Cc: Andrew Morton, the arch/x86 maintainers,
Linux Kernel Mailing List, Peter Zijlstra, Nick Piggin,
Thomas Gleixner, David Howells, Yoshinori Sato
Ingo Molnar wrote:
> * Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>
>> include/asm-frv/pgtable.h | 4 +-
>>
>
> Needs the ack of the FRV arch maintainer both for content and for
> flow (i.e. via x86 tree). If any second thoughts are expressed about
> the flow then this needs to go on separate tracks.
>
I don't know why frv defines this; it's just cut'n'paste from the default
no-op implementation in asm-generic/pgtable.h. (h8300 too, it seems.)
David, do you have a specific reason for defining
arch_enter/leave_lazy_cpu_mode() in asm-frv/pgtable.h? It seems to have
come in with 28936117af849b8c2fca664a41ea7651a0d99591 "FRV: Add some
missng lazy MMU hooks for NOMMU mode". The intention was that
asm-generic/pgtable.h should supply the default definitions; is that
incompatible with nommu or something?
Yoshinori-san, do you have a specific reason for defining
arch_enter/leave_lazy_cpu_mode() in asm-h8300/pgtable.h? It seems to
have come in with c728d60455e8e8722ee08312a75f38dd7a866b5e "h8300
generic irq", which doesn't seem like a related change.
>> include/asm-generic/pgtable.h | 21 +++++++------
>>
>
> Needs the ack of arch maintainers in general and a linux-arch
> cross-post.
>
asm-generic/pgtable.h defines the default no-op implementation, which is
used when the architecture has no particular use for the hook. The only
non-x86 definitions are frv and h8300, and they're both copies of the
no-op definition.
In any case, if this change is a sticking point, we can easily drop it,
as the subsequent patches have no dependency on it.
J
* Re: [PATCH] Allow preemption during lazy mmu updates
From: David Howells @ 2009-04-08 18:10 UTC
To: Jeremy Fitzhardinge
Cc: dhowells, Ingo Molnar, Andrew Morton, the arch/x86 maintainers,
Linux Kernel Mailing List, Peter Zijlstra, Nick Piggin,
Thomas Gleixner, Yoshinori Sato
Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> David, do you have a specific reason for defining
> arch_enter/leave_lazy_cpu_mode() in asm-frv/pgtable.h? It seems to have come
> in with 28936117af849b8c2fca664a41ea7651a0d99591 "FRV: Add some missng lazy
> MMU hooks for NOMMU mode". The intention was that asm-generic/pgtable.h
> should supply the default definitions; is that incompatible with nommu or
> something?
I don't remember. It was over two years ago. My guess is that it was to get
things to compile. Note that in NOMMU mode, <asm-generic/pgtable.h> is _not_
#included by <asm-frv/pgtable.h>.
David
* [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
From: David Howells @ 2009-04-08 18:30 UTC
To: jeremy, mingo, akpm
Cc: x86, linux-kernel, a.p.zijlstra, nickpiggin, tglx, dhowells,
ysato
asm-frv/pgtable.h could just #include <asm-generic/pgtable.h> in NOMMU mode
rather than #defining macros for lazy MMU and CPU stuff.
Signed-off-by: David Howells <dhowells@redhat.com>
---
include/asm-frv/pgtable.h | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/include/asm-frv/pgtable.h b/include/asm-frv/pgtable.h
index e16fdb1..3323301 100644
--- a/include/asm-frv/pgtable.h
+++ b/include/asm-frv/pgtable.h
@@ -71,10 +71,8 @@ static inline int pte_file(pte_t pte) { return 0; }
#define swapper_pg_dir ((pgd_t *) NULL)
#define pgtable_cache_init() do {} while (0)
-#define arch_enter_lazy_mmu_mode() do {} while (0)
-#define arch_leave_lazy_mmu_mode() do {} while (0)
-#define arch_enter_lazy_cpu_mode() do {} while (0)
-#define arch_leave_lazy_cpu_mode() do {} while (0)
+
+#include <asm-generic/pgtable.h>
#else /* !CONFIG_MMU */
/*****************************************************************************/
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
From: Jeremy Fitzhardinge @ 2009-04-08 18:35 UTC
To: David Howells
Cc: mingo, akpm, x86, linux-kernel, a.p.zijlstra, nickpiggin, tglx,
ysato
David Howells wrote:
> asm-frv/pgtable.h could just #include <asm-generic/pgtable.h> in NOMMU mode
> rather than #defining macros for lazy MMU and CPU stuff.
>
OK, thanks. Do you mind this patch going via tip.git?
J
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
>
> include/asm-frv/pgtable.h | 6 ++----
> 1 files changed, 2 insertions(+), 4 deletions(-)
>
>
> diff --git a/include/asm-frv/pgtable.h b/include/asm-frv/pgtable.h
> index e16fdb1..3323301 100644
> --- a/include/asm-frv/pgtable.h
> +++ b/include/asm-frv/pgtable.h
> @@ -71,10 +71,8 @@ static inline int pte_file(pte_t pte) { return 0; }
> #define swapper_pg_dir ((pgd_t *) NULL)
>
> #define pgtable_cache_init() do {} while (0)
> -#define arch_enter_lazy_mmu_mode() do {} while (0)
> -#define arch_leave_lazy_mmu_mode() do {} while (0)
> -#define arch_enter_lazy_cpu_mode() do {} while (0)
> -#define arch_leave_lazy_cpu_mode() do {} while (0)
> +
> +#include <asm-generic/pgtable.h>
>
> #else /* !CONFIG_MMU */
> /*****************************************************************************/
>
>
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
From: Sam Ravnborg @ 2009-04-08 18:44 UTC
To: David Howells
Cc: jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra, nickpiggin,
tglx, ysato
On Wed, Apr 08, 2009 at 07:30:11PM +0100, David Howells wrote:
> asm-frv/pgtable.h could just #include <asm-generic/pgtable.h> in NOMMU mode
> rather than #defining macros for lazy MMU and CPU stuff.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
>
> include/asm-frv/pgtable.h | 6 ++----
> 1 files changed, 2 insertions(+), 4 deletions(-)
>
>
> diff --git a/include/asm-frv/pgtable.h b/include/asm-frv/pgtable.h
> index e16fdb1..3323301 100644
> --- a/include/asm-frv/pgtable.h
> +++ b/include/asm-frv/pgtable.h
Any chance we can have these header files moved before too
many patches are queued?
Sam
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
From: David Howells @ 2009-04-08 18:47 UTC
To: Jeremy Fitzhardinge
Cc: dhowells, mingo, akpm, x86, linux-kernel, a.p.zijlstra,
nickpiggin, tglx, ysato
Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> OK, thanks. Do you mind this patch going via tip.git?
No.
David
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
2009-04-08 18:30 ` [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode David Howells
` (2 preceding siblings ...)
2009-04-08 18:47 ` David Howells
@ 2009-04-08 18:47 ` David Howells
2009-04-08 20:56 ` Sam Ravnborg
2009-04-08 21:25 ` David Howells
3 siblings, 2 replies; 30+ messages in thread
From: David Howells @ 2009-04-08 18:47 UTC (permalink / raw)
To: Sam Ravnborg
Cc: dhowells, jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra,
nickpiggin, tglx, ysato
Sam Ravnborg <sam@ravnborg.org> wrote:
> Any chance we can have these header files moved before too
> many patches are queued?
Sure. Do you have patches to move FRV and MN10300 headers?
David
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
2009-04-08 18:47 ` David Howells
@ 2009-04-08 20:56 ` Sam Ravnborg
2009-04-08 21:25 ` David Howells
1 sibling, 0 replies; 30+ messages in thread
From: Sam Ravnborg @ 2009-04-08 20:56 UTC (permalink / raw)
To: David Howells
Cc: jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra, nickpiggin,
tglx, ysato
On Wed, Apr 08, 2009 at 07:47:57PM +0100, David Howells wrote:
> Sam Ravnborg <sam@ravnborg.org> wrote:
>
> > Any chance we can have these header files moved before too
> > many patches are queued?
>
> Sure. Do you have patches to move FRV and MN10300 headers?
For frv it is as simple as:
mkdir -p arch/frv/include/asm
git mv include/asm-frv/* arch/frv/include/asm
git commit
For mn10300 Al Viro sent a set of patches some months ago.
Google turned up the following:
http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/00826.html
http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/00827.html
I did not try, but I assume they would apply almost cleanly today.
Sam
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
2009-04-08 18:47 ` David Howells
2009-04-08 20:56 ` Sam Ravnborg
@ 2009-04-08 21:25 ` David Howells
2009-04-08 21:40 ` Sam Ravnborg
2009-04-08 21:48 ` David Howells
1 sibling, 2 replies; 30+ messages in thread
From: David Howells @ 2009-04-08 21:25 UTC (permalink / raw)
To: Sam Ravnborg
Cc: dhowells, jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra,
nickpiggin, tglx, ysato
Sam Ravnborg <sam@ravnborg.org> wrote:
> > > Any chance we can have these header files moved before too
> > > many patches are queued?
> >
> > Sure. Do you have patches to move FRV and MN10300 headers?
>
> For frv it is as simple as:
>
> mkdir -p arch/frv/include/asm
> git mv include/asm-frv/* arch/frv/include/asm
> git commit
The thing I really detest about doing this is this:
warthog>git-log -M arch/frv/include/asm/pgtable.h | cat
commit 2203e07b4ba1d1113ab80e7c51062f6994a62d3a
Author: David Howells <dhowells@redhat.com>
Date: Wed Apr 8 22:20:14 2009 +0100
FRV: Move to arch/frv/include/asm/
Move from include/asm-frv/ to arch/frv/include/asm/.
Signed-off-by: David Howells <dhowells@redhat.com>
warthog>
So much for the old history:
warthog>git-log include/asm-frv/pgtable.h | grep ^commit | wc -l
20
warthog>
David
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
2009-04-08 21:25 ` David Howells
@ 2009-04-08 21:40 ` Sam Ravnborg
2009-04-08 21:48 ` David Howells
1 sibling, 0 replies; 30+ messages in thread
From: Sam Ravnborg @ 2009-04-08 21:40 UTC (permalink / raw)
To: David Howells
Cc: jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra, nickpiggin,
tglx, ysato
On Wed, Apr 08, 2009 at 10:25:55PM +0100, David Howells wrote:
> Sam Ravnborg <sam@ravnborg.org> wrote:
>
> > > > Any chance we can have these header files moved before too
> > > > many patches are queued?
> > >
> > > Sure. Do you have patches to move FRV and MN10300 headers?
> >
> > For frv it is as simple as:
> >
> > mkdir -p arch/frv/include/asm
> > git mv include/asm-frv/* arch/frv/include/asm
> > git commit
>
> The thing I really detest about doing this is this:
>
> warthog>git-log -M arch/frv/include/asm/pgtable.h | cat
> commit 2203e07b4ba1d1113ab80e7c51062f6994a62d3a
> Author: David Howells <dhowells@redhat.com>
> Date: Wed Apr 8 22:20:14 2009 +0100
>
> FRV: Move to arch/frv/include/asm/
>
> Move from include/asm-frv/ to arch/frv/include/asm/.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> warthog>
>
> So much for the old history:
>
> warthog>git-log include/asm-frv/pgtable.h | grep ^commit | wc -l
> 20
> warthog>
--follow is needed to make git log follow renames.
It is not that the history is lost, just hidden a bit.
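For example (assuming the move above has been committed):

  git log arch/frv/include/asm/pgtable.h            # shows only the move
  git log --follow arch/frv/include/asm/pgtable.h   # shows the old
                                                    # include/asm-frv/ history too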
Sam
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
2009-04-08 21:25 ` David Howells
2009-04-08 21:40 ` Sam Ravnborg
@ 2009-04-08 21:48 ` David Howells
2009-04-08 22:04 ` Sam Ravnborg
1 sibling, 1 reply; 30+ messages in thread
From: David Howells @ 2009-04-08 21:48 UTC (permalink / raw)
To: Sam Ravnborg
Cc: dhowells, jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra,
nickpiggin, tglx, ysato
Sam Ravnborg <sam@ravnborg.org> wrote:
> --follow is needed to follow renames when doing git log.
So what's '-M' for?
David
* Re: [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode
2009-04-08 21:48 ` David Howells
@ 2009-04-08 22:04 ` Sam Ravnborg
0 siblings, 0 replies; 30+ messages in thread
From: Sam Ravnborg @ 2009-04-08 22:04 UTC (permalink / raw)
To: David Howells
Cc: jeremy, mingo, akpm, x86, linux-kernel, a.p.zijlstra, nickpiggin,
tglx, ysato
On Wed, Apr 08, 2009 at 10:48:57PM +0100, David Howells wrote:
> Sam Ravnborg <sam@ravnborg.org> wrote:
>
> > --follow is needed to follow renames when doing git log.
>
> So what's '-M' for?
'-M' is about rename detection when creating diffs; it does not make
git log follow a path across renames.
If you need more details you need to ask on the git list.
I only know how to use git, not so much why.
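A rough illustration (hypothetical usage, same path as above):

  git log -M --stat arch/frv/include/asm/pgtable.h  # -M: the move shows up
                                                    # as a rename in the stat
  git log --follow arch/frv/include/asm/pgtable.h   # --follow: the walk
                                                    # continues past the rename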
Sam
* Re: [PATCH] Allow preemption during lazy mmu updates
2009-04-08 15:11 ` Peter Zijlstra
@ 2009-04-19 10:15 ` Avi Kivity
2009-04-19 10:47 ` Peter Zijlstra
2009-04-19 23:53 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 30+ messages in thread
From: Avi Kivity @ 2009-04-19 10:15 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, Jeremy Fitzhardinge, Andrew Morton,
the arch/x86 maintainers, Linux Kernel Mailing List, Nick Piggin,
Thomas Gleixner
Peter Zijlstra wrote:
> On Wed, 2009-04-08 at 16:54 +0200, Ingo Molnar wrote:
>
>>> kernel/sched.c | 2 -
>>>
>> Needs the ack of ... oh, never mind - this one is fine i guess ;-)
>>
>
> Ah, about that. This new preemption hook has slightly different
> requirements than the current preempt-notifiers have (hence the new
> hook); I was wondering if KVM (afaik currently the only preempt-notifier
> consumer) could live with these requirements.
>
> That is, could these be merged?
>
What are the slight differences in requirements?
KVM wants to run in non-preemptible, interrupts-enabled context.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH] Allow preemption during lazy mmu updates
2009-04-19 10:15 ` Avi Kivity
@ 2009-04-19 10:47 ` Peter Zijlstra
2009-04-19 23:53 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 30+ messages in thread
From: Peter Zijlstra @ 2009-04-19 10:47 UTC (permalink / raw)
To: Avi Kivity
Cc: Ingo Molnar, Jeremy Fitzhardinge, Andrew Morton,
the arch/x86 maintainers, Linux Kernel Mailing List, Nick Piggin,
Thomas Gleixner
On Sun, 2009-04-19 at 13:15 +0300, Avi Kivity wrote:
> Peter Zijlstra wrote:
> > On Wed, 2009-04-08 at 16:54 +0200, Ingo Molnar wrote:
> >
> >>> kernel/sched.c | 2 -
> >>>
> >> Needs the ack of ... oh, never mind - this one is fine i guess ;-)
> >>
> >
> > Ah, about that. This new preemption hook has slightly different
> > requirements than the current preempt-notifiers have (hence the new
> > hook); I was wondering if KVM (afaik currently the only preempt-notifier
> > consumer) could live with these requirements.
> >
> > That is, could these be merged?
> >
>
> What are the slight differences in requirements?
>
> KVM wants to run in non-preemptible, interrupts-enabled context.
The fire_sched_out bit is a little earlier, but I don't think that is
particularly worrisome; the more important difference is that
fire_sched_in is far too late. arch_end_context_switch() is done right
in the middle of switch_to() because it needs the TS bit or some such.
I'll let Jeremy explain the details, as I've long since forgotten them :-)
* Re: [PATCH] Allow preemption during lazy mmu updates
2009-04-19 10:15 ` Avi Kivity
2009-04-19 10:47 ` Peter Zijlstra
@ 2009-04-19 23:53 ` Jeremy Fitzhardinge
2009-04-20 6:02 ` Avi Kivity
1 sibling, 1 reply; 30+ messages in thread
From: Jeremy Fitzhardinge @ 2009-04-19 23:53 UTC (permalink / raw)
To: Avi Kivity
Cc: Peter Zijlstra, Ingo Molnar, Andrew Morton,
the arch/x86 maintainers, Linux Kernel Mailing List, Nick Piggin,
Thomas Gleixner
Avi Kivity wrote:
> What are the slight differences in requirements?
>
> KVM wants to run in non-preemptible, interrupts-enabled context.
There are two hooks: arch_start_context_switch() in
kernel/sched.c:context_switch(), and arch_end_context_switch() in
arch/x86/kernel/process_(32|64).c. They bound the heart of the context
switch, in which various bits of core state are changed: the cr3
reload, fpu TS flag, iopl, tls slots, etc. All of these require a
hypercall in a paravirtualized environment, and so can be batched
together into a multicall (or whatever) to minimize the number of
context-switch time hypercalls. The placement of end_context_switch is
particularly sensitive because it needs to be in the right place
relative to fpu context reload and segment register reloading.
Preemption is definitely disabled, and interrupts as well, I think. So
perhaps these won't work for you.
However, looking at it, fire_sched_out_preempt_notifiers() is almost in
the same position as arch_start_context_switch(), so I think they could
be unified one way or the other. The sched_in notifier happens way too
late, though. Does KVM use both in and out, or just one?
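Roughly, the ordering looks like this (a simplified sketch of
context_switch(), not the actual sched.c code):

	static inline void
	context_switch(struct rq *rq, struct task_struct *prev,
		       struct task_struct *next)
	{
		prepare_task_switch(rq, prev, next);	/* fires the sched_out
							   preempt notifiers */
		arch_start_context_switch(prev);	/* begin batching pv
							   hypercalls */
		/* ... mm/active_mm juggling ... */
		switch_to(prev, next, prev);		/* __switch_to() calls
							   arch_end_context_switch()
							   between the fpu/TS and
							   segment reloads */
	}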
J
* Re: [PATCH] Allow preemption during lazy mmu updates
2009-04-19 23:53 ` Jeremy Fitzhardinge
@ 2009-04-20 6:02 ` Avi Kivity
0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2009-04-20 6:02 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Peter Zijlstra, Ingo Molnar, Andrew Morton,
the arch/x86 maintainers, Linux Kernel Mailing List, Nick Piggin,
Thomas Gleixner
Jeremy Fitzhardinge wrote:
> Avi Kivity wrote:
>> What are the slight differences in requirements?
>>
>> KVM wants to run in non-preemptible, interrupts-enabled context.
>
> There are two hooks: arch_start_context_switch() in
> kernel/sched.c:context_switch(), and arch_end_context_switch() in
> arch/x86/kernel/process_(32|64).c. They bound the heart of the
> context switch, in which various bits of core state are changed: the
> cr3 reload, fpu TS flag, iopl, tls slots, etc. All of these require a
> hypercall in a paravirtualized environment, and so can be
> batched together into a multicall (or whatever) to minimize the number
> of context-switch time hypercalls. The placement of
> end_context_switch is particularly sensitive because it needs to be in
> the right place relative to fpu context reload and segment register
> reloading.
>
> Preemption is definitely disabled, and interrupts as well, I think.
> So perhaps these won't work for you.
>
> However, looking at it, fire_sched_out_preempt_notifiers() is almost
> in the same position as arch_start_context_switch(), so I think they
> could be unified one way or the other. The sched_in notifier happens
> way too late, though. Does KVM use both in and out, or just one?
Both. sched_out loads the host MSR_STAR and friends, sched_in loads the
guest values.
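From memory, the glue looks roughly like this (a sketch, not verbatim
kvm_main.c):

	static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
	{
		struct kvm_vcpu *vcpu = container_of(pn, struct kvm_vcpu,
						     preempt_notifier);
		kvm_arch_vcpu_load(vcpu, cpu);	/* load guest MSR_STAR etc. */
	}

	static void kvm_sched_out(struct preempt_notifier *pn,
				  struct task_struct *next)
	{
		struct kvm_vcpu *vcpu = container_of(pn, struct kvm_vcpu,
						     preempt_notifier);
		kvm_arch_vcpu_put(vcpu);	/* back to the host values */
	}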
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
end of thread
Thread overview: 30+ messages
2009-03-27 18:02 [PATCH] Allow preemption during lazy mmu updates Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 1/8] mm: disable preemption in apply_to_pte_range Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 2/8] x86/paravirt: remove lazy mode in interrupts Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 3/8] x86/pvops: replace arch_enter_lazy_cpu_mode with arch_start_context_switch Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 4/8] x86/paravirt: flush pending mmu updates on context switch Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 5/8] x86/paravirt: finish change from lazy cpu to context switch start/end Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 6/8] x86/paravirt: allow preemption with lazy mmu mode Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 7/8] mm: allow preemption in apply_to_pte_range Jeremy Fitzhardinge
2009-04-07 21:38 ` Andrew Morton
2009-04-07 21:54 ` Jeremy Fitzhardinge
2009-03-27 18:02 ` [PATCH 8/8] x86/paravirt: use percpu_ rather than __get_cpu_var Jeremy Fitzhardinge
2009-03-27 23:48 ` [PATCH] Allow preemption during lazy mmu updates Peter Zijlstra
2009-04-08 14:54 ` Ingo Molnar
2009-04-08 15:11 ` Peter Zijlstra
2009-04-19 10:15 ` Avi Kivity
2009-04-19 10:47 ` Peter Zijlstra
2009-04-19 23:53 ` Jeremy Fitzhardinge
2009-04-20 6:02 ` Avi Kivity
2009-04-08 17:32 ` Jeremy Fitzhardinge
2009-04-08 18:10 ` David Howells
2009-04-08 18:30 ` [PATCH] FRV: Use <asm-generic/pgtable.h> in NOMMU mode David Howells
2009-04-08 18:35 ` Jeremy Fitzhardinge
2009-04-08 18:44 ` Sam Ravnborg
2009-04-08 18:47 ` David Howells
2009-04-08 18:47 ` David Howells
2009-04-08 20:56 ` Sam Ravnborg
2009-04-08 21:25 ` David Howells
2009-04-08 21:40 ` Sam Ravnborg
2009-04-08 21:48 ` David Howells
2009-04-08 22:04 ` Sam Ravnborg