From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Christoph Lameter <cl@linux.com>,
	akpm@linux-foundation.org, Pekka Enberg <penberg@cs.helsinki.fi>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-kernel@vger.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tejun Heo <tj@kernel.org>,
	linux-mm@kvack.org
Subject: Re: [thisops uV3 07/18] highmem: Use this_cpu_xx_return() operations
Date: Tue, 30 Nov 2010 20:19:42 +0100
Message-ID: <1291144782.32004.1146.camel@laptop>
In-Reply-To: <1291144408.2904.232.camel@edumazet-laptop>

On Tue, 2010-11-30 at 20:13 +0100, Eric Dumazet wrote:
> On Tuesday, 30 November 2010 at 13:07 -0600, Christoph Lameter wrote:
> > plain text document attachment (this_cpu_highmem)
> > Use this_cpu operations to optimize access primitives for highmem.
> > 
> > The main effect is the avoidance of address calculations through the
> > use of a segment prefix.
> > 
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Christoph Lameter <cl@linux.com>
> > 
> > ---
> >  include/linux/highmem.h |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > Index: linux-2.6/include/linux/highmem.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/highmem.h	2010-11-22 14:43:40.000000000 -0600
> > +++ linux-2.6/include/linux/highmem.h	2010-11-22 14:45:02.000000000 -0600
> > @@ -81,7 +81,8 @@ DECLARE_PER_CPU(int, __kmap_atomic_idx);
> >  
> >  static inline int kmap_atomic_idx_push(void)
> >  {
> > -	int idx = __get_cpu_var(__kmap_atomic_idx)++;
> > +	int idx = __this_cpu_inc_return(__kmap_atomic_idx) - 1;
> > +
> >  #ifdef CONFIG_DEBUG_HIGHMEM
> >  	WARN_ON_ONCE(in_irq() && !irqs_disabled());
> >  	BUG_ON(idx > KM_TYPE_NR);
> > @@ -91,12 +92,12 @@ static inline int kmap_atomic_idx_push(v
> >  
> >  static inline int kmap_atomic_idx(void)
> >  {
> > -	return __get_cpu_var(__kmap_atomic_idx) - 1;
> > +	return __this_cpu_read(__kmap_atomic_idx) - 1;
> >  }
> >  
> >  static inline int kmap_atomic_idx_pop(void)
> >  {
> > -	int idx = --__get_cpu_var(__kmap_atomic_idx);
> > +	int idx = __this_cpu_dec_return(__kmap_atomic_idx);
> 
> __this_cpu_dec_return() is only needed if CONFIG_DEBUG_HIGHMEM is set
> 
> >  #ifdef CONFIG_DEBUG_HIGHMEM
> >  	BUG_ON(idx < 0);
> >  #endif
> > 
> 
> You could change kmap_atomic_idx_pop() to return void, and use
> __this_cpu_dec(__kmap_atomic_idx)

You can do the void change unconditionally; the debug code already uses
kmap_atomic_idx() because of:


---
commit 20273941f2129aa5a432796d98a276ed73d60782
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Wed Oct 27 15:32:58 2010 -0700

    mm: fix race in kunmap_atomic()
    
    Christoph reported a nice splat which illustrated a race in the new stack
    based kmap_atomic implementation.
    
    The problem is that we pop our stack slot before we're completely done
    resetting its state -- in particular clearing the PTE (sometimes that's
    CONFIG_DEBUG_HIGHMEM).  If an interrupt happens before we actually clear
    the PTE used for the last slot, that interrupt can reuse the slot in a
    dirty state, which triggers a BUG in kmap_atomic().
    
    Fix this by introducing kmap_atomic_idx() which reports the current slot
    index without actually releasing it and use that to find the PTE and delay
    the _pop() until after we're completely done.
    
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Reported-by: Christoph Hellwig <hch@infradead.org>
    Acked-by: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index c00f119..c435fd9 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -89,7 +89,7 @@ void __kunmap_atomic(void *kvaddr)
 	int idx, type;
 
 	if (kvaddr >= (void *)FIXADDR_START) {
-		type = kmap_atomic_idx_pop();
+		type = kmap_atomic_idx();
 		idx = type + KM_TYPE_NR * smp_processor_id();
 
 		if (cache_is_vivt())
@@ -101,6 +101,7 @@ void __kunmap_atomic(void *kvaddr)
 #else
 		(void) idx;  /* to kill a warning */
 #endif
+		kmap_atomic_idx_pop();
 	} else if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) {
 		/* this address was obtained through kmap_high_get() */
 		kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)]));
diff --git a/arch/frv/mm/highmem.c b/arch/frv/mm/highmem.c
index 61088dc..fd7fcd4 100644
--- a/arch/frv/mm/highmem.c
+++ b/arch/frv/mm/highmem.c
@@ -68,7 +68,7 @@ EXPORT_SYMBOL(__kmap_atomic);
 
 void __kunmap_atomic(void *kvaddr)
 {
-	int type = kmap_atomic_idx_pop();
+	int type = kmap_atomic_idx();
 	switch (type) {
 	case 0:		__kunmap_atomic_primary(4, 6);	break;
 	case 1:		__kunmap_atomic_primary(5, 7);	break;
@@ -83,6 +83,7 @@ void __kunmap_atomic(void *kvaddr)
 	default:
 		BUG();
 	}
+	kmap_atomic_idx_pop();
 	pagefault_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/mips/mm/highmem.c b/arch/mips/mm/highmem.c
index 1e69b1f..3634c7e 100644
--- a/arch/mips/mm/highmem.c
+++ b/arch/mips/mm/highmem.c
@@ -74,7 +74,7 @@ void __kunmap_atomic(void *kvaddr)
 		return;
 	}
 
-	type = kmap_atomic_idx_pop();
+	type = kmap_atomic_idx();
 #ifdef CONFIG_DEBUG_HIGHMEM
 	{
 		int idx = type + KM_TYPE_NR * smp_processor_id();
@@ -89,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
 		local_flush_tlb_one(vaddr);
 	}
 #endif
+	kmap_atomic_idx_pop();
 	pagefault_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/mn10300/include/asm/highmem.h b/arch/mn10300/include/asm/highmem.h
index f577ba2..e2155e6 100644
--- a/arch/mn10300/include/asm/highmem.h
+++ b/arch/mn10300/include/asm/highmem.h
@@ -101,7 +101,7 @@ static inline void __kunmap_atomic(unsigned long vaddr)
 		return;
 	}
 
-	type = kmap_atomic_idx_pop();
+	type = kmap_atomic_idx();
 
 #if HIGHMEM_DEBUG
 	{
@@ -119,6 +119,8 @@ static inline void __kunmap_atomic(unsigned long vaddr)
 		__flush_tlb_one(vaddr);
 	}
 #endif
+
+	kmap_atomic_idx_pop();
 	pagefault_enable();
 }
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/mm/highmem.c b/arch/powerpc/mm/highmem.c
index b0848b4..e7450bd 100644
--- a/arch/powerpc/mm/highmem.c
+++ b/arch/powerpc/mm/highmem.c
@@ -62,7 +62,7 @@ void __kunmap_atomic(void *kvaddr)
 		return;
 	}
 
-	type = kmap_atomic_idx_pop();
+	type = kmap_atomic_idx();
 
 #ifdef CONFIG_DEBUG_HIGHMEM
 	{
@@ -79,6 +79,8 @@ void __kunmap_atomic(void *kvaddr)
 		local_flush_tlb_page(NULL, vaddr);
 	}
 #endif
+
+	kmap_atomic_idx_pop();
 	pagefault_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c
index 5e50c09..4730eac 100644
--- a/arch/sparc/mm/highmem.c
+++ b/arch/sparc/mm/highmem.c
@@ -75,7 +75,7 @@ void __kunmap_atomic(void *kvaddr)
 		return;
 	}
 
-	type = kmap_atomic_idx_pop();
+	type = kmap_atomic_idx();
 
 #ifdef CONFIG_DEBUG_HIGHMEM
 	{
@@ -104,6 +104,8 @@ void __kunmap_atomic(void *kvaddr)
 #endif
 	}
 #endif
+
+	kmap_atomic_idx_pop();
 	pagefault_enable();
 }
 EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/tile/mm/highmem.c b/arch/tile/mm/highmem.c
index 8ef6595..abb5733 100644
--- a/arch/tile/mm/highmem.c
+++ b/arch/tile/mm/highmem.c
@@ -241,7 +241,7 @@ void __kunmap_atomic(void *kvaddr)
 		pte_t pteval = *pte;
 		int idx, type;
 
-		type = kmap_atomic_idx_pop();
+		type = kmap_atomic_idx();
 		idx = type + KM_TYPE_NR*smp_processor_id();
 
 		/*
@@ -252,6 +252,7 @@ void __kunmap_atomic(void *kvaddr)
 		BUG_ON(!pte_present(pteval) && !pte_migrating(pteval));
 		kmap_atomic_unregister(pte_page(pteval), vaddr);
 		kpte_clear_flush(pte, vaddr);
+		kmap_atomic_idx_pop();
 	} else {
 		/* Must be a lowmem page */
 		BUG_ON(vaddr < PAGE_OFFSET);
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index d723e36..b499626 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -74,7 +74,7 @@ void __kunmap_atomic(void *kvaddr)
 	    vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) {
 		int idx, type;
 
-		type = kmap_atomic_idx_pop();
+		type = kmap_atomic_idx();
 		idx = type + KM_TYPE_NR * smp_processor_id();
 
 #ifdef CONFIG_DEBUG_HIGHMEM
@@ -87,6 +87,7 @@ void __kunmap_atomic(void *kvaddr)
 		 * attributes or becomes a protected page in a hypervisor.
 		 */
 		kpte_clear_flush(kmap_pte-idx, vaddr);
+		kmap_atomic_idx_pop();
 	}
 #ifdef CONFIG_DEBUG_HIGHMEM
 	else {
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 75a3d7f..7b179b4 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -98,7 +98,7 @@ iounmap_atomic(void __iomem *kvaddr)
 	    vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) {
 		int idx, type;
 
-		type = kmap_atomic_idx_pop();
+		type = kmap_atomic_idx();
 		idx = type + KM_TYPE_NR * smp_processor_id();
 
 #ifdef CONFIG_DEBUG_HIGHMEM
@@ -111,6 +111,7 @@ iounmap_atomic(void __iomem *kvaddr)
 		 * attributes or becomes a protected page in a hypervisor.
 		 */
 		kpte_clear_flush(kmap_pte-idx, vaddr);
+		kmap_atomic_idx_pop();
 	}
 
 	pagefault_enable();
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 102f76b..e913819 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -88,6 +88,11 @@ static inline int kmap_atomic_idx_push(void)
 	return idx;
 }
 
+static inline int kmap_atomic_idx(void)
+{
+	return __get_cpu_var(__kmap_atomic_idx) - 1;
+}
+
 static inline int kmap_atomic_idx_pop(void)
 {
 	int idx = --__get_cpu_var(__kmap_atomic_idx);
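
For anyone following the ordering argument in the commit message above,
here is a minimal userspace sketch of the race. It is not kernel code:
kmap_idx stands in for the per-CPU __kmap_atomic_idx, slot_mapped[] for
the per-slot PTE state, and idx_push(), idx_peek(), idx_pop(), fake_irq(),
unmap_pop_first(), unmap_pop_last() and run_scenario() are hypothetical
names made up for this model. The interrupt is simulated by a plain
function call placed in the window between the two steps of the unmap
path, and the pop helper returns void as suggested in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define KM_TYPE_NR 4                  /* hypothetical slot count for this model */

static int  kmap_idx;                 /* models the per-CPU __kmap_atomic_idx */
static bool slot_mapped[KM_TYPE_NR];  /* models "PTE for this slot is still set" */
static bool irq_saw_dirty_slot;       /* set where the kernel would hit BUG */

static int idx_push(void)
{
	int idx = kmap_idx++;

	assert(idx < KM_TYPE_NR);
	return idx;
}

/* Like kmap_atomic_idx(): report the current slot without releasing it. */
static int idx_peek(void)
{
	return kmap_idx - 1;
}

/* Like a void kmap_atomic_idx_pop(): release the slot, return nothing. */
static void idx_pop(void)
{
	kmap_idx--;
}

/* Simulated interrupt that does its own short-lived atomic mapping. */
static void fake_irq(void)
{
	int idx = idx_push();

	if (slot_mapped[idx]) {
		irq_saw_dirty_slot = true;  /* slot reused while still mapped */
	} else {
		slot_mapped[idx] = true;    /* map */
		slot_mapped[idx] = false;   /* unmap */
	}
	idx_pop();
}

/* Old order: the slot is released before its state is cleared. */
static void unmap_pop_first(void)
{
	int idx = idx_peek();

	idx_pop();
	fake_irq();                 /* interrupt lands in the window */
	slot_mapped[idx] = false;
}

/* Fixed order: look up the slot, clear its state, pop last. */
static void unmap_pop_last(void)
{
	int idx = idx_peek();

	fake_irq();                 /* same window, slot is still owned */
	slot_mapped[idx] = false;
	idx_pop();
}

/* Map one slot, unmap it with the given routine, report what the irq saw. */
static bool run_scenario(void (*unmap)(void))
{
	kmap_idx = 0;
	for (int i = 0; i < KM_TYPE_NR; i++)
		slot_mapped[i] = false;
	irq_saw_dirty_slot = false;

	slot_mapped[idx_push()] = true;
	unmap();
	return irq_saw_dirty_slot;
}
```

In this model run_scenario(unmap_pop_first) returns true, because the
simulated interrupt is handed slot 0 while it is still marked mapped, and
run_scenario(unmap_pop_last) returns false, because the interrupt takes
slot 1 instead. Nothing in either unmap routine needs a return value from
the pop.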

