* [git pull] x86: fix global_flush_tlb() bug
@ 2007-10-19 10:48 Ingo Molnar
2007-10-19 12:05 ` Andi Kleen
0 siblings, 1 reply; 2+ messages in thread
From: Ingo Molnar @ 2007-10-19 10:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, Greg KH, Chris Wright, Andi Kleen, Andrew Morton,
Jan Beulich, linux-kernel
Please find below a fix for a pretty serious global_flush_tlb() bug on
x86-64. I think it is a -stable candidate too.
Linus, please pull this fix from the x86 git tree:
ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86.git
|
| Ingo Molnar (1):
| x86: fix global_flush_tlb() bug
thanks,
Ingo
------------------>
Subject: x86: fix global_flush_tlb() bug
From: Ingo Molnar <mingo@elte.hu>
While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():
down_read(&init_mm.mmap_sem);
list_replace_init(&deferred_pages, &l);
up_read(&init_mm.mmap_sem);
this is SMP-unsafe because down_read() admits several readers at once,
so list_replace_init() done on two CPUs in parallel can corrupt the list.
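To make the problem concrete, here is a minimal user-space sketch of what
list_replace_init() does. The list helpers are re-implemented for
illustration (the kernel versions live in include/linux/list.h), and
detach_demo() is a hypothetical name, not kernel code. The point is that
the helper is a sequence of plain, non-atomic pointer stores: if two CPUs
both read old->next before either one resets the old head, both local
heads end up pointing at the same entries.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space copy of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Same steps as the kernel helper: four plain pointer stores that splice
 * new_head in place of old_head, then a reset of old_head to empty.
 * Nothing here is atomic, so callers must serialize against each other.
 */
static void list_replace_init(struct list_head *old_head,
			      struct list_head *new_head)
{
	new_head->next = old_head->next;
	new_head->next->prev = new_head;
	new_head->prev = old_head->prev;
	new_head->prev->next = new_head;
	init_list_head(old_head);
}

static int list_count(const struct list_head *head)
{
	const struct list_head *p;
	int n = 0;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/*
 * Hypothetical demo: queue two entries, detach them onto a local head.
 * Returns the number of entries moved; *left_behind receives the number
 * of entries still on the shared head afterwards.
 */
int detach_demo(int *left_behind)
{
	struct list_head deferred, a, b, l;

	init_list_head(&deferred);
	list_add_tail(&a, &deferred);
	list_add_tail(&b, &deferred);

	list_replace_init(&deferred, &l);

	*left_behind = list_count(&deferred);
	return list_count(&l);
}
```

Run single-threaded, the detach moves every entry and leaves the shared
head empty. That result is only guaranteed while one caller at a time
runs list_replace_init() on the shared head.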
This bug was introduced about a year ago in the 64-bit tree:
commit ea7322decb974a4a3e804f96a0201e893ff88ce3
Author: Andi Kleen <ak@suse.de>
Date: Thu Dec 7 02:14:05 2006 +0100
[PATCH] x86-64: Speed and clean up cache flushing in change_page_attr
down_read(&init_mm.mmap_sem);
- dpage = xchg(&deferred_pages, NULL);
+ list_replace_init(&deferred_pages, &l);
up_read(&init_mm.mmap_sem);
the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.
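For comparison, a rough user-space analogue of the xchg() approach, using
C11 atomic_exchange() in place of the kernel's xchg(). The struct layout
and the names defer_page() and take_deferred_pages() are assumptions made
for this sketch, and the compare-and-swap push is only one possible way
to add entries. The detach is a single atomic swap of the head pointer,
so each queued chain is handed to exactly one caller even when several
CPUs detach at the same time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical singly linked entry; not the kernel's struct page. */
struct deferred_page {
	struct deferred_page *next;
	int id;
};

/* Shared head of the deferred chain; zero-initialized, i.e. empty. */
static _Atomic(struct deferred_page *) deferred_pages;

/* Push one entry with a compare-and-swap loop (an assumption of this sketch). */
void defer_page(struct deferred_page *pg)
{
	struct deferred_page *head = atomic_load(&deferred_pages);

	do {
		pg->next = head;
	} while (!atomic_compare_exchange_weak(&deferred_pages, &head, pg));
}

/*
 * Detach the whole chain in one atomic step. Concurrent callers each get
 * either a complete chain or NULL, never a partially spliced list.
 */
struct deferred_page *take_deferred_pages(void)
{
	return atomic_exchange(&deferred_pages, (struct deferred_page *)NULL);
}
```

A second take_deferred_pages() call right after the first returns NULL,
which is the property the list_head based version lost.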
why this bug never became prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.
the safe fix for now is to write-lock init_mm.mmap_sem.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr_64.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
Index: linux/arch/x86/mm/pageattr_64.c
===================================================================
--- linux.orig/arch/x86/mm/pageattr_64.c
+++ linux/arch/x86/mm/pageattr_64.c
@@ -255,9 +255,14 @@ void global_flush_tlb(void)
struct page *pg, *next;
struct list_head l;
- down_read(&init_mm.mmap_sem);
+ /*
+ * Write-protect the semaphore, to exclude two contexts
+ * doing a list_replace_init() call in parallel and to
+ * exclude new additions to the deferred_pages list:
+ */
+ down_write(&init_mm.mmap_sem);
list_replace_init(&deferred_pages, &l);
- up_read(&init_mm.mmap_sem);
+ up_write(&init_mm.mmap_sem);
flush_map(&l);
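The effect of the patch can be sketched in user space as well. This is
only an analogue under stated assumptions: a C11 atomic_flag spinlock
stands in for taking init_mm.mmap_sem with down_write(), and the names
lock_exclusive(), unlock_exclusive(), defer_entry() and take_deferred()
are hypothetical. What it shows is the exclusion the fix relies on: both
the addition and the detach run under the same exclusive lock, so the
non-atomic pointer updates never interleave.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Shared list head, statically initialized to the empty list. */
static struct list_head deferred_pages = { &deferred_pages, &deferred_pages };

/* Stand-in for the write side of init_mm.mmap_sem (assumption). */
static atomic_flag deferred_lock = ATOMIC_FLAG_INIT;

static void lock_exclusive(void)
{
	while (atomic_flag_test_and_set_explicit(&deferred_lock,
						 memory_order_acquire))
		; /* spin until the current holder releases the lock */
}

static void unlock_exclusive(void)
{
	atomic_flag_clear_explicit(&deferred_lock, memory_order_release);
}

/* Queue one entry at the tail, under the exclusive lock. */
void defer_entry(struct list_head *entry)
{
	lock_exclusive();
	entry->prev = deferred_pages.prev;
	entry->next = &deferred_pages;
	deferred_pages.prev->next = entry;
	deferred_pages.prev = entry;
	unlock_exclusive();
}

/*
 * Move everything onto the caller's local head l and reset the shared
 * head. These are the list_replace_init() steps, now serialized.
 */
void take_deferred(struct list_head *l)
{
	lock_exclusive();
	l->next = deferred_pages.next;
	l->next->prev = l;
	l->prev = deferred_pages.prev;
	l->prev->next = l;
	deferred_pages.next = &deferred_pages;
	deferred_pages.prev = &deferred_pages;
	unlock_exclusive();
}
```

As in the patch, the expensive work on the detached list (flush_map() in
the kernel) would happen after the lock is dropped, on the private copy.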
* Re: [git pull] x86: fix global_flush_tlb() bug
2007-10-19 10:48 [git pull] x86: fix global_flush_tlb() bug Ingo Molnar
@ 2007-10-19 12:05 ` Andi Kleen
0 siblings, 0 replies; 2+ messages in thread
From: Andi Kleen @ 2007-10-19 12:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Thomas Gleixner, Greg KH, Chris Wright,
Andi Kleen, Andrew Morton, Jan Beulich, linux-kernel
Thanks for catching.
> why this bug never become prominent is a mystery - it can probably be
> explained with the (still) relative obscurity of the x86_64 architecture.
global_flush_tlb() is not very common in the grand scheme of things. In a
normal system it only runs single-threaded, during X server startup and
when the system starts.
So while it's nasty it's unlikely to really hit people in practice.
BTW while looking I noticed this code in the vermilion driver is also
surely not correct:
/*
* Change caching policy of the linear kernel map to avoid
* mapping type conflicts with user-space mappings.
* The first global_flush_tlb() is really only there to do a global
* wbinvd().
*/
global_flush_tlb();
That is not what global_flush_tlb() is guaranteed to do.
It would probably be best to just do away with global_flush_tlb() and fold
it directly into change_page_attr(). I've seen little evidence the delayed
flush optimization ever made much difference and it seems to be misused and
a source of bugs. And nearly all legitimate users seem to always call it
directly after change_page_attr() anyways.
Besides it is grossly misnamed -- it does much more than flushing TLBs.
-Andi