* [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
@ 2004-01-15 0:39 Jun Sun
2004-01-15 1:12 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Jun Sun @ 2004-01-15 0:39 UTC (permalink / raw)
To: linux-mips, linux-kernel; +Cc: jsun
[-- Attachment #1: Type: text/plain, Size: 2947 bytes --]
I have been chasing a nasty memory corruption bug on my MIPS box with
2.6.1 kernel. In the end it appears the following sequence has
happened:
1. userland gets a page and writes some stuff to it, which dirties
data cache. In my case, it is actually doing a sys_read() into
that page. See my kgdb trace attached in the end.
2. userland returns this page to kernel *without* any cache flushing,
i.e., the dcache is still dirty.
3. kernel calls kmalloc() to get a block from this page.
4. the dirty dcache is written back to physical memory some time later,
corrupting the kernel data.
It seems to me the problem is that we should do a cache flush
for all the pages returned to kernel during step 2.
I attached a hack which solves my problem but I am not sure if it is
most appropriate. It looks like the affected user region (start, end)
can span over multiple vma areas. If so, the fix will only flush the first
area.
Also, it is hard to find an appropriate place to do the flushing
The new 2.6 mm is a confusing maze to me. I hope someone more
knowledgable can come up with a more decent fix for this problem.
BTW, it appears in 2.4 we are doing this flushing in do_zap_page_range()
where we call a flush_cache_range(mm, start, end).
Jun
P.S., Here is the stack trace when dirty data is written to user page:
$8 = 0x2aac3000
(gdb) p/x kaddr
$9 = 0x811a3880
(gdb) bt
#0 both_aligned () at arch/mips/lib/memcpy.S:222
#1 0xffffffff8014351c in file_read_actor (desc=0x87fd1dd0, page=0x8102c178,
offset=0, size=2717) at mm/filemap.c:754
#2 0xffffffff80143088 in do_generic_mapping_read (mapping=0x803e3168,
ra=0x87fe8868, filp=0x87fe8820, ppos=0x87fe8840, desc=0x87fd1dd0,
actor=0x80143480 <file_read_actor>) at mm/filemap.c:658
#3 0xffffffff801437a0 in __generic_file_aio_read (iocb=0x80143480,
iov=0x811a3880, nr_segs=1, ppos=0x87fe8840) at fs.h:1334
#4 0xffffffff80143884 in generic_file_read (filp=0x61697265,
buf=0xcccccccd <Address 0xcccccccd out of bounds>, count=715927552,
ppos=0xa9d) at mm/filemap.c:877
#5 0xffffffff80162688 in vfs_read (file=0x87fe8820,
buf=0x2aac3000 "# /etc/inittab: init(8) configuration.\n# $Id: inittab,v 1.8 1998/05/10 10:37:50 miquels Exp $\n\n# The default runlevel.\nid:3:initdefault:\n\n# Boot-time system configuration/initialization script.\n# This"...,
count=4096, pos=0x87fe8840) at fs/read_write.c:213
#6 0xffffffff80162918 in sys_read (fd=715929728,
buf=0x2aac3000 "# /etc/inittab: init(8) configuration.\n# $Id: inittab,v 1.8 1998/05/10 10:37:50 miquels Exp $\n\n# The default runlevel.\nid:3:initdefault:\n\n# Boot-time system configuration/initialization script.\n# This"...,
count=4096) at fs/read_write.c:278
It appears I lost the stack trace when kernel finds data corruption. It is somewhere
inside d_splice_alias() where inode->i_dentry, happens to be at 0x811a3880, has
a wrong value instead of the expected 0.
[-- Attachment #2: missing-cache-flush-on-return-user-page.patch --]
[-- Type: text/plain, Size: 372 bytes --]
diff -Nru linux/mm/mmap.c.orig linux/mm/mmap.c
--- linux/mm/mmap.c.orig Mon Jan 12 16:31:22 2004
+++ linux/mm/mmap.c Wed Jan 14 14:22:20 2004
@@ -1134,6 +1134,8 @@
struct mmu_gather *tlb;
unsigned long nr_accounted = 0;
+ flush_cache_range(vma, start, end);
+
lru_add_drain();
tlb = tlb_gather_mmu(mm, 0);
unmap_vmas(&tlb, mm, vma, start, end, &nr_accounted);
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 0:39 [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel Jun Sun
@ 2004-01-15 1:12 ` Andrew Morton
2004-01-15 1:12 ` Andrew Morton
2004-01-15 1:29 ` Andrew Morton
0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2004-01-15 1:12 UTC (permalink / raw)
To: Jun Sun; +Cc: linux-mips, linux-kernel, jsun, Russell King
Jun Sun <jsun@mvista.com> wrote:
>
> I have been chasing a nasty memory corruption bug on my MIPS box with
> 2.6.1 kernel. In the end it appears the following sequence has
> happened:
>
> 1. userland gets a page and writes some stuff to it, which dirties
> data cache. In my case, it is actually doing a sys_read() into
> that page. See my kgdb trace attached in the end.
>
> 2. userland returns this page to kernel *without* any cache flushing,
> i.e., the dcache is still dirty.
>
> 3. kernel calls kmalloc() to get a block from this page.
>
> 4. the dirty dcache is written back to physical memory some time later,
> corrupting the kernel data.
>
> It seems to me the problem is that we should do a cache flush
> for all the pages returned to kernel during step 2.
>
> I attached a hack which solves my problem but I am not sure if it is
> most appropriate. It looks like the affected user region (start, end)
> can span over multiple vma areas. If so, the fix will only flush the first
> area.
>
> Also, it is hard to find an appropriate place to do the flushing
> The new 2.6 mm is a confusing maze to me. I hope someone more
> knowledgable can come up with a more decent fix for this problem.
>
> BTW, it appears in 2.4 we are doing this flushing in do_zap_page_range()
> where we call a flush_cache_range(mm, start, end).
That flush_cache_range was removed between 2.5.67 and 2.5.68. If you put
it back, does it fix the problem?
It seems from Russell's words here, MIPS should be flushing in
tlb_start_vma().
I think that's wrong, really. We've discussed this before and decided that
these flushing operations should be open-coded in the main .c file rather
than embedded in arch functions which happen to undocumentedly do other
stuff.
# --------------------------------------------
# 03/04/14 rmk@arm.linux.org.uk 1.1017
# [PATCH] flush_cache_mm in zap_page_range
#
# unmap_vmas() eventually calls tlb_start_vma(), where most architectures
# flush caches as necessary. The flush here seems to make the
# flush_cache_range() in zap_page_range() redundant, and therefore can be
# removed.
# --------------------------------------------
#
diff -Nru a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c Wed Jan 14 17:09:07 2004
+++ b/mm/memory.c Wed Jan 14 17:09:07 2004
@@ -601,7 +601,6 @@
lru_add_drain();
spin_lock(&mm->page_table_lock);
- flush_cache_range(vma, address, end);
tlb = tlb_gather_mmu(mm, 0);
unmap_vmas(&tlb, mm, vma, address, end, &nr_accounted);
tlb_finish_mmu(tlb, address, end);
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 1:12 ` Andrew Morton
@ 2004-01-15 1:12 ` Andrew Morton
2004-01-15 1:29 ` Andrew Morton
1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2004-01-15 1:12 UTC (permalink / raw)
To: Jun Sun; +Cc: linux-mips, linux-kernel, Russell King
Jun Sun <jsun@mvista.com> wrote:
>
> I have been chasing a nasty memory corruption bug on my MIPS box with
> 2.6.1 kernel. In the end it appears the following sequence has
> happened:
>
> 1. userland gets a page and writes some stuff to it, which dirties
> data cache. In my case, it is actually doing a sys_read() into
> that page. See my kgdb trace attached in the end.
>
> 2. userland returns this page to kernel *without* any cache flushing,
> i.e., the dcache is still dirty.
>
> 3. kernel calls kmalloc() to get a block from this page.
>
> 4. the dirty dcache is written back to physical memory some time later,
> corrupting the kernel data.
>
> It seems to me the problem is that we should do a cache flush
> for all the pages returned to kernel during step 2.
>
> I attached a hack which solves my problem but I am not sure if it is
> most appropriate. It looks like the affected user region (start, end)
> can span over multiple vma areas. If so, the fix will only flush the first
> area.
>
> Also, it is hard to find an appropriate place to do the flushing
> The new 2.6 mm is a confusing maze to me. I hope someone more
> knowledgable can come up with a more decent fix for this problem.
>
> BTW, it appears in 2.4 we are doing this flushing in do_zap_page_range()
> where we call a flush_cache_range(mm, start, end).
That flush_cache_range was removed between 2.5.67 and 2.5.68. If you put
it back, does it fix the problem?
It seems from Russell's words here, MIPS should be flushing in
tlb_start_vma().
I think that's wrong, really. We've discussed this before and decided that
these flushing operations should be open-coded in the main .c file rather
than embedded in arch functions which happen to undocumentedly do other
stuff.
# --------------------------------------------
# 03/04/14 rmk@arm.linux.org.uk 1.1017
# [PATCH] flush_cache_mm in zap_page_range
#
# unmap_vmas() eventually calls tlb_start_vma(), where most architectures
# flush caches as necessary. The flush here seems to make the
# flush_cache_range() in zap_page_range() redundant, and therefore can be
# removed.
# --------------------------------------------
#
diff -Nru a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c Wed Jan 14 17:09:07 2004
+++ b/mm/memory.c Wed Jan 14 17:09:07 2004
@@ -601,7 +601,6 @@
lru_add_drain();
spin_lock(&mm->page_table_lock);
- flush_cache_range(vma, address, end);
tlb = tlb_gather_mmu(mm, 0);
unmap_vmas(&tlb, mm, vma, address, end, &nr_accounted);
tlb_finish_mmu(tlb, address, end);
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 1:12 ` Andrew Morton
2004-01-15 1:12 ` Andrew Morton
@ 2004-01-15 1:29 ` Andrew Morton
2004-01-15 1:40 ` Jun Sun
1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2004-01-15 1:29 UTC (permalink / raw)
To: jsun, linux-mips, linux-kernel, rmk
Andrew Morton <akpm@osdl.org> wrote:
>
> I think that's wrong, really. We've discussed this before and decided that
> these flushing operations should be open-coded in the main .c file rather
> than embedded in arch functions which happen to undocumentedly do other
> stuff.
err, OK, I give up. Lots of architectures do the cache flush in
tlb_start_vma(). I guess mips may as well do the same.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 1:29 ` Andrew Morton
@ 2004-01-15 1:40 ` Jun Sun
2004-01-15 6:23 ` David S. Miller
0 siblings, 1 reply; 8+ messages in thread
From: Jun Sun @ 2004-01-15 1:40 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mips, linux-kernel, rmk, jsun
On Wed, Jan 14, 2004 at 05:29:46PM -0800, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > I think that's wrong, really. We've discussed this before and decided that
> > these flushing operations should be open-coded in the main .c file rather
> > than embedded in arch functions which happen to undocumentedly do other
> > stuff.
>
> err, OK, I give up. Lots of architectures do the cache flush in
> tlb_start_vma(). I guess mips may as well do the same.
>
Looking at my tree (which is from linux-mips.org), it appears
arm, sparc, sparc64, and sh have tlb_start_vma() defined to call
cache flushing.
What exactly does tlb_start_vma()/tlb_end_vma() mean? There is
only one invocation instance, which is significant enough to infer
the meaning. :)
BTW, either my original hack or putting a cache flush in tlb_start_vma()
solves my problem. They are really doing the same thing, just at
different places.
Jun
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 1:40 ` Jun Sun
@ 2004-01-15 6:23 ` David S. Miller
2004-01-15 6:23 ` David S. Miller
2004-01-15 18:03 ` Jun Sun
0 siblings, 2 replies; 8+ messages in thread
From: David S. Miller @ 2004-01-15 6:23 UTC (permalink / raw)
To: Jun Sun; +Cc: akpm, linux-mips, linux-kernel, rmk, jsun
On Wed, 14 Jan 2004 17:40:12 -0800
Jun Sun <jsun@mvista.com> wrote:
> Looking at my tree (which is from linux-mips.org), it appears
> arm, sparc, sparc64, and sh have tlb_start_vma() defined to call
> cache flushing.
Correct, in fact every platform where cache flushing matters
at all (ie. where flush_cache_*() routines actually need to
flush a cpu cache), they should have tlb_start_vma() do such
a flush.
> What exactly does tlb_start_vma()/tlb_end_vma() mean? There is
> only one invocation instance, which is significant enough to infer
> the meaning. :)
When the kernel unmaps a mmap region of a process (either for the
sake of munmap() or tearing down all mapping during exit()) tlb_start_vma()
is called, the page table mappings in the region are torn down one by
one, then a tlb_end_vma() call is made.
At the top level, ie. whoever invokes unmap_page_range(), there will
be a tlb_gather_mmu() call.
In order to properly optimize the cache flushes, most platforms do the
following:
1) The tlb->fullmm boolean keeps trap of whether this is just a munmap()
unmapping operation (if zero) or a full address space teardown
(if non-zero).
2) In the full address space teardown case, and thus tlb->fullmm is
non-zero, the top level will do the explict flush_cache_mm()
(see mm/mmap.c:exit_mmap()), therefore the tlb_start_vma()
implementation need not do the flush, otherwise it does.
This is why sparc64 and friends implement it like this:
#define tlb_start_vma(tlb, vma) \
do { if (!(tlb)->fullmm) \
flush_cache_range(vma, vma->vm_start, vma->vm_end); \
} while (0)
Hope this clears things up.
Someone should probably take what I just wrote, expand and organize it,
then add such content to Documentation/cachetlb.txt
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 6:23 ` David S. Miller
@ 2004-01-15 6:23 ` David S. Miller
2004-01-15 18:03 ` Jun Sun
1 sibling, 0 replies; 8+ messages in thread
From: David S. Miller @ 2004-01-15 6:23 UTC (permalink / raw)
To: Jun Sun; +Cc: akpm, linux-mips, linux-kernel, rmk
On Wed, 14 Jan 2004 17:40:12 -0800
Jun Sun <jsun@mvista.com> wrote:
> Looking at my tree (which is from linux-mips.org), it appears
> arm, sparc, sparc64, and sh have tlb_start_vma() defined to call
> cache flushing.
Correct, in fact every platform where cache flushing matters
at all (ie. where flush_cache_*() routines actually need to
flush a cpu cache), they should have tlb_start_vma() do such
a flush.
> What exactly does tlb_start_vma()/tlb_end_vma() mean? There is
> only one invocation instance, which is significant enough to infer
> the meaning. :)
When the kernel unmaps a mmap region of a process (either for the
sake of munmap() or tearing down all mapping during exit()) tlb_start_vma()
is called, the page table mappings in the region are torn down one by
one, then a tlb_end_vma() call is made.
At the top level, ie. whoever invokes unmap_page_range(), there will
be a tlb_gather_mmu() call.
In order to properly optimize the cache flushes, most platforms do the
following:
1) The tlb->fullmm boolean keeps trap of whether this is just a munmap()
unmapping operation (if zero) or a full address space teardown
(if non-zero).
2) In the full address space teardown case, and thus tlb->fullmm is
non-zero, the top level will do the explict flush_cache_mm()
(see mm/mmap.c:exit_mmap()), therefore the tlb_start_vma()
implementation need not do the flush, otherwise it does.
This is why sparc64 and friends implement it like this:
#define tlb_start_vma(tlb, vma) \
do { if (!(tlb)->fullmm) \
flush_cache_range(vma, vma->vm_start, vma->vm_end); \
} while (0)
Hope this clears things up.
Someone should probably take what I just wrote, expand and organize it,
then add such content to Documentation/cachetlb.txt
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel
2004-01-15 6:23 ` David S. Miller
2004-01-15 6:23 ` David S. Miller
@ 2004-01-15 18:03 ` Jun Sun
1 sibling, 0 replies; 8+ messages in thread
From: Jun Sun @ 2004-01-15 18:03 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-mips, jsun
[-- Attachment #1: Type: text/plain, Size: 2167 bytes --]
Looks the right fix is to add cache flush to tlb_start_vma().
See the patch attached. Unless someone objects, I will check it
later.
BTW, I really don't like the function naming of tlb_start_vma()
and tbl_end_vma(). :)
Jun
On Wed, Jan 14, 2004 at 10:23:16PM -0800, David S. Miller wrote:
> On Wed, 14 Jan 2004 17:40:12 -0800
> Jun Sun <jsun@mvista.com> wrote:
>
> > Looking at my tree (which is from linux-mips.org), it appears
> > arm, sparc, sparc64, and sh have tlb_start_vma() defined to call
> > cache flushing.
>
> Correct, in fact every platform where cache flushing matters
> at all (ie. where flush_cache_*() routines actually need to
> flush a cpu cache), they should have tlb_start_vma() do such
> a flush.
>
> > What exactly does tlb_start_vma()/tlb_end_vma() mean? There is
> > only one invocation instance, which is significant enough to infer
> > the meaning. :)
>
> When the kernel unmaps a mmap region of a process (either for the
> sake of munmap() or tearing down all mapping during exit()) tlb_start_vma()
> is called, the page table mappings in the region are torn down one by
> one, then a tlb_end_vma() call is made.
>
> At the top level, ie. whoever invokes unmap_page_range(), there will
> be a tlb_gather_mmu() call.
>
> In order to properly optimize the cache flushes, most platforms do the
> following:
>
> 1) The tlb->fullmm boolean keeps trap of whether this is just a munmap()
> unmapping operation (if zero) or a full address space teardown
> (if non-zero).
>
> 2) In the full address space teardown case, and thus tlb->fullmm is
> non-zero, the top level will do the explict flush_cache_mm()
> (see mm/mmap.c:exit_mmap()), therefore the tlb_start_vma()
> implementation need not do the flush, otherwise it does.
>
> This is why sparc64 and friends implement it like this:
>
> #define tlb_start_vma(tlb, vma) \
> do { if (!(tlb)->fullmm) \
> flush_cache_range(vma, vma->vm_start, vma->vm_end); \
> } while (0)
>
> Hope this clears things up.
>
> Someone should probably take what I just wrote, expand and organize it,
> then add such content to Documentation/cachetlb.txt
>
[-- Attachment #2: real-fix.patch --]
[-- Type: text/plain, Size: 751 bytes --]
diff -Nru linux/include/asm-mips/tlb.h.orig linux/include/asm-mips/tlb.h
--- linux/include/asm-mips/tlb.h.orig Thu Oct 31 08:35:52 2002
+++ linux/include/asm-mips/tlb.h Thu Jan 15 10:02:14 2004
@@ -2,9 +2,14 @@
#define __ASM_TLB_H
/*
- * MIPS doesn't need any special per-pte or per-vma handling..
+ * MIPS doesn't need any special per-pte or per-vma handling, except
+ * we need to flush cache for area to be unmapped.
*/
-#define tlb_start_vma(tlb, vma) do { } while (0)
+#define tlb_start_vma(tlb, vma) \
+ do { \
+ if (!tlb->fullmm) \
+ flush_cache_range(vma, vma->vm_start, vma->vm_end); \
+ } while (0)
#define tlb_end_vma(tlb, vma) do { } while (0)
#define __tlb_remove_tlb_entry(tlb, ptep, address) do { } while (0)
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2004-01-15 18:03 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-01-15 0:39 [BUG] 2.6.1/MIPS - missing cache flushing when user program returns pages to kernel Jun Sun
2004-01-15 1:12 ` Andrew Morton
2004-01-15 1:12 ` Andrew Morton
2004-01-15 1:29 ` Andrew Morton
2004-01-15 1:40 ` Jun Sun
2004-01-15 6:23 ` David S. Miller
2004-01-15 6:23 ` David S. Miller
2004-01-15 18:03 ` Jun Sun
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox