From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Christoph Lameter <cl@linux.com>,
akpm@linux-foundation.org, Pekka Enberg <penberg@cs.helsinki.fi>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-kernel@vger.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Tejun Heo <tj@kernel.org>,
linux-mm@kvack.org
Subject: Re: [thisops uV3 07/18] highmem: Use this_cpu_xx_return() operations
Date: Tue, 30 Nov 2010 20:19:42 +0100
Message-ID: <1291144782.32004.1146.camel@laptop>
In-Reply-To: <1291144408.2904.232.camel@edumazet-laptop>
On Tue, 2010-11-30 at 20:13 +0100, Eric Dumazet wrote:
> On Tuesday, 30 November 2010 at 13:07 -0600, Christoph Lameter wrote:
> > plain text document attachment (this_cpu_highmem)
> > Use this_cpu operations to optimize access primitives for highmem.
> >
> > The main effect is the avoidance of address calculations through the
> > use of a segment prefix.
> >
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Christoph Lameter <cl@linux.com>
> >
> > ---
> > include/linux/highmem.h | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > Index: linux-2.6/include/linux/highmem.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/highmem.h 2010-11-22 14:43:40.000000000 -0600
> > +++ linux-2.6/include/linux/highmem.h 2010-11-22 14:45:02.000000000 -0600
> > @@ -81,7 +81,8 @@ DECLARE_PER_CPU(int, __kmap_atomic_idx);
> >
> > static inline int kmap_atomic_idx_push(void)
> > {
> > - int idx = __get_cpu_var(__kmap_atomic_idx)++;
> > + int idx = __this_cpu_inc_return(__kmap_atomic_idx) - 1;
> > +
> > #ifdef CONFIG_DEBUG_HIGHMEM
> > WARN_ON_ONCE(in_irq() && !irqs_disabled());
> > BUG_ON(idx > KM_TYPE_NR);
> > @@ -91,12 +92,12 @@ static inline int kmap_atomic_idx_push(v
> >
> > static inline int kmap_atomic_idx(void)
> > {
> > - return __get_cpu_var(__kmap_atomic_idx) - 1;
> > + return __this_cpu_read(__kmap_atomic_idx) - 1;
> > }
> >
> > static inline int kmap_atomic_idx_pop(void)
> > {
> > - int idx = --__get_cpu_var(__kmap_atomic_idx);
> > + int idx = __this_cpu_dec_return(__kmap_atomic_idx);
>
> __this_cpu_dec_return() is only needed if CONFIG_DEBUG_HIGHMEM
>
> > #ifdef CONFIG_DEBUG_HIGHMEM
> > BUG_ON(idx < 0);
> > #endif
> >
>
> You could change kmap_atomic_idx_pop() to return void, and use
> __this_cpu_dec(__kmap_atomic_idx)
You can do the void change unconditionally; the debug code already uses
kmap_atomic_idx() because of:
---
commit 20273941f2129aa5a432796d98a276ed73d60782
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Oct 27 15:32:58 2010 -0700
mm: fix race in kunmap_atomic()
Christoph reported a nice splat which illustrated a race in the new stack
based kmap_atomic implementation.
The problem is that we pop our stack slot before we're completely done
resetting its state -- in particular clearing the PTE (sometimes that's
CONFIG_DEBUG_HIGHMEM). If an interrupt happens before we actually clear
the PTE used for the last slot, that interrupt can reuse the slot in a
dirty state, which triggers a BUG in kmap_atomic().
Fix this by introducing kmap_atomic_idx() which reports the current slot
index without actually releasing it and use that to find the PTE and delay
the _pop() until after we're completely done.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Christoph Hellwig <hch@infradead.org>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
index c00f119..c435fd9 100644
--- a/arch/arm/mm/highmem.c
+++ b/arch/arm/mm/highmem.c
@@ -89,7 +89,7 @@ void __kunmap_atomic(void *kvaddr)
int idx, type;
if (kvaddr >= (void *)FIXADDR_START) {
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
idx = type + KM_TYPE_NR * smp_processor_id();
if (cache_is_vivt())
@@ -101,6 +101,7 @@ void __kunmap_atomic(void *kvaddr)
#else
(void) idx; /* to kill a warning */
#endif
+ kmap_atomic_idx_pop();
} else if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) {
/* this address was obtained through kmap_high_get() */
kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)]));
diff --git a/arch/frv/mm/highmem.c b/arch/frv/mm/highmem.c
index 61088dc..fd7fcd4 100644
--- a/arch/frv/mm/highmem.c
+++ b/arch/frv/mm/highmem.c
@@ -68,7 +68,7 @@ EXPORT_SYMBOL(__kmap_atomic);
void __kunmap_atomic(void *kvaddr)
{
- int type = kmap_atomic_idx_pop();
+ int type = kmap_atomic_idx();
switch (type) {
case 0: __kunmap_atomic_primary(4, 6); break;
case 1: __kunmap_atomic_primary(5, 7); break;
@@ -83,6 +83,7 @@ void __kunmap_atomic(void *kvaddr)
default:
BUG();
}
+ kmap_atomic_idx_pop();
pagefault_enable();
}
EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/mips/mm/highmem.c b/arch/mips/mm/highmem.c
index 1e69b1f..3634c7e 100644
--- a/arch/mips/mm/highmem.c
+++ b/arch/mips/mm/highmem.c
@@ -74,7 +74,7 @@ void __kunmap_atomic(void *kvaddr)
return;
}
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
#ifdef CONFIG_DEBUG_HIGHMEM
{
int idx = type + KM_TYPE_NR * smp_processor_id();
@@ -89,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
local_flush_tlb_one(vaddr);
}
#endif
+ kmap_atomic_idx_pop();
pagefault_enable();
}
EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/mn10300/include/asm/highmem.h b/arch/mn10300/include/asm/highmem.h
index f577ba2..e2155e6 100644
--- a/arch/mn10300/include/asm/highmem.h
+++ b/arch/mn10300/include/asm/highmem.h
@@ -101,7 +101,7 @@ static inline void __kunmap_atomic(unsigned long vaddr)
return;
}
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
#if HIGHMEM_DEBUG
{
@@ -119,6 +119,8 @@ static inline void __kunmap_atomic(unsigned long vaddr)
__flush_tlb_one(vaddr);
}
#endif
+
+ kmap_atomic_idx_pop();
pagefault_enable();
}
#endif /* __KERNEL__ */
diff --git a/arch/powerpc/mm/highmem.c b/arch/powerpc/mm/highmem.c
index b0848b4..e7450bd 100644
--- a/arch/powerpc/mm/highmem.c
+++ b/arch/powerpc/mm/highmem.c
@@ -62,7 +62,7 @@ void __kunmap_atomic(void *kvaddr)
return;
}
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
#ifdef CONFIG_DEBUG_HIGHMEM
{
@@ -79,6 +79,8 @@ void __kunmap_atomic(void *kvaddr)
local_flush_tlb_page(NULL, vaddr);
}
#endif
+
+ kmap_atomic_idx_pop();
pagefault_enable();
}
EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c
index 5e50c09..4730eac 100644
--- a/arch/sparc/mm/highmem.c
+++ b/arch/sparc/mm/highmem.c
@@ -75,7 +75,7 @@ void __kunmap_atomic(void *kvaddr)
return;
}
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
#ifdef CONFIG_DEBUG_HIGHMEM
{
@@ -104,6 +104,8 @@ void __kunmap_atomic(void *kvaddr)
#endif
}
#endif
+
+ kmap_atomic_idx_pop();
pagefault_enable();
}
EXPORT_SYMBOL(__kunmap_atomic);
diff --git a/arch/tile/mm/highmem.c b/arch/tile/mm/highmem.c
index 8ef6595..abb5733 100644
--- a/arch/tile/mm/highmem.c
+++ b/arch/tile/mm/highmem.c
@@ -241,7 +241,7 @@ void __kunmap_atomic(void *kvaddr)
pte_t pteval = *pte;
int idx, type;
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
idx = type + KM_TYPE_NR*smp_processor_id();
/*
@@ -252,6 +252,7 @@ void __kunmap_atomic(void *kvaddr)
BUG_ON(!pte_present(pteval) && !pte_migrating(pteval));
kmap_atomic_unregister(pte_page(pteval), vaddr);
kpte_clear_flush(pte, vaddr);
+ kmap_atomic_idx_pop();
} else {
/* Must be a lowmem page */
BUG_ON(vaddr < PAGE_OFFSET);
diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index d723e36..b499626 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -74,7 +74,7 @@ void __kunmap_atomic(void *kvaddr)
vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) {
int idx, type;
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
idx = type + KM_TYPE_NR * smp_processor_id();
#ifdef CONFIG_DEBUG_HIGHMEM
@@ -87,6 +87,7 @@ void __kunmap_atomic(void *kvaddr)
* attributes or becomes a protected page in a hypervisor.
*/
kpte_clear_flush(kmap_pte-idx, vaddr);
+ kmap_atomic_idx_pop();
}
#ifdef CONFIG_DEBUG_HIGHMEM
else {
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 75a3d7f..7b179b4 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -98,7 +98,7 @@ iounmap_atomic(void __iomem *kvaddr)
vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) {
int idx, type;
- type = kmap_atomic_idx_pop();
+ type = kmap_atomic_idx();
idx = type + KM_TYPE_NR * smp_processor_id();
#ifdef CONFIG_DEBUG_HIGHMEM
@@ -111,6 +111,7 @@ iounmap_atomic(void __iomem *kvaddr)
* attributes or becomes a protected page in a hypervisor.
*/
kpte_clear_flush(kmap_pte-idx, vaddr);
+ kmap_atomic_idx_pop();
}
pagefault_enable();
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 102f76b..e913819 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -88,6 +88,11 @@ static inline int kmap_atomic_idx_push(void)
return idx;
}
+static inline int kmap_atomic_idx(void)
+{
+ return __get_cpu_var(__kmap_atomic_idx) - 1;
+}
+
static inline int kmap_atomic_idx_pop(void)
{
int idx = --__get_cpu_var(__kmap_atomic_idx);
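A minimal userspace sketch of the void-returning pop discussed above, not the kernel implementation. It assumes a plain static int as a stand-in for the per-CPU variable __kmap_atomic_idx, assert() as a stand-in for BUG_ON(), and an illustrative KM_TYPE_NR value. The names kmap_atomic_idx_counter and sketch_kunmap_atomic() are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Assumed value, for illustration only. */
#define KM_TYPE_NR 16

/* Hypothetical stand-in for the per-CPU variable __kmap_atomic_idx. */
static int kmap_atomic_idx_counter;

/* Claim the next slot and return its index. */
static inline int kmap_atomic_idx_push(void)
{
	int idx = kmap_atomic_idx_counter++;

	/* Stricter bound than the quoted BUG_ON, chosen for this sketch. */
	assert(idx < KM_TYPE_NR);
	return idx;
}

/* Report the current slot index without releasing it. */
static inline int kmap_atomic_idx(void)
{
	return kmap_atomic_idx_counter - 1;
}

/* Release the current slot; no caller needs the return value any more. */
static inline void kmap_atomic_idx_pop(void)
{
	kmap_atomic_idx_counter--;
	assert(kmap_atomic_idx_counter >= 0);
}

/*
 * Ordering used by the race fix: look up the slot, finish tearing it
 * down, and only then pop it. The int array stands in for the PTEs.
 */
static inline int sketch_kunmap_atomic(int *pte_slots)
{
	int type = kmap_atomic_idx();

	pte_slots[type] = 0;	/* "clear the PTE" while the slot is still owned */
	kmap_atomic_idx_pop();
	return type;
}
```

Because every unmap path reads the index through kmap_atomic_idx() first, the pop only has to decrement the counter, which is why a plain decrement would probably suffice there.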