* problems on D-cache alias in 2.4.22
@ 2004-05-13 6:52 wuming
2004-05-14 7:45 ` Peter Horton
0 siblings, 1 reply; 19+ messages in thread
From: wuming @ 2004-05-13 6:52 UTC (permalink / raw)
To: linux-mips
I am developing with linux-2.4.22 on a machine whose D-cache is virtually
indexed and physically tagged. When I compile some application programs,
I get the following error:
cc1: internal compiler error: Segmentation fault
From what I have found on the internet, this error is caused either by a
hardware fault or by a wrong pte fault handler. Because my machine has
D-cache aliasing, I think a wrong pte fault handler is to blame. After
some painful kernel hacking, I found a strange problem in the function
__update_cache():
void __update_cache(struct vm_area_struct *vma, unsigned long address,
	pte_t pte)
{
	unsigned long addr;
	struct page *page;

	if (!cpu_has_dc_aliases)
		return;

	page = pte_page(pte);

	/*This printk is added by myself*/
	printk("<1>valid page:%d\tpage mapping:0x%p\tpage flags:%d\n",
		VALID_PAGE(page), page->mapping,
		(page->flags & (1UL << PG_dcache_dirty)));

	if (VALID_PAGE(page) && page->mapping &&
	    (page->flags & (1UL << PG_dcache_dirty))) {
		if (pages_do_alias((unsigned long) page_address(page),
				   address & PAGE_MASK)) {
			addr = (unsigned long) page_address(page);
			flush_data_cache_page(addr);
		}
		ClearPageDcacheDirty(page);
	}
}
While my kernel is running, I found that the conditions "page->mapping" and
"(page->flags & (1UL << PG_dcache_dirty))" are never true at the same time,
so flush_data_cache_page() is never called.
When I commented out these two conditions, the compiler error disappeared.
I do not understand this behaviour very well, so I need some help.
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: problems on D-cache alias in 2.4.22
@ 2004-05-13 22:05 Bob Breuer
2004-05-13 22:05 ` Bob Breuer
2004-05-14 2:59 ` wuming
0 siblings, 2 replies; 19+ messages in thread
From: Bob Breuer @ 2004-05-13 22:05 UTC (permalink / raw)
To: linux-mips
> -----Original Message-----
> Date: Thu, 13 May 2004 14:52:53 +0800
> From: wuming <wuming@ict.ac.cn>
> Subject: problems on D-cache alias in 2.4.22
>
> I am developing in linux-2.4.22 on the machine with virtual address
> indexed and physical
> address tagged. But when I compile some application programs,
> I met the
> following error:
>
> cc1: internal compiler error: Segmentation fault
>
> I have searched about this error from internet, it's due to some
> hardware fault or
> a wrong pte fault handler. Because my machine have D-cache
> aliasing, so
> I think
> this error should be due to a wrong pte fault handler. After
> my painful
> kernel hacking,
> I found some strange problems and it's in function __update_cache( ):
>
> void __update_cache(struct vm_area_struct *vma, unsigned long address,
> pte_t pte)
> {
> unsigned long addr;
> struct page *page;
>
> if (!cpu_has_dc_aliases)
> return;
>
> page = pte_page(pte);
>
> /*This printk is added by myself*/
> printk("<1>valid page:%d\tpage mapping:0x%p\tpage flags:%d\n",\
> VALID_PAGE(page), page->mapping, (page->flags & (1UL <<
> PG_dcache_dirty)));
>
> if (VALID_PAGE(page) && page->mapping &&
> (page->flags & (1UL << PG_dcache_dirty))) {
> if (pages_do_alias((unsigned long) page_address(page), address &
> PAGE_MASK)) {
> addr = (unsigned long) page_address(page);
> flush_data_cache_page(addr);
> }
> ClearPageDcacheDirty(page);
> }
> }
>
> When my kernel is running, I found the condition "page->mapping" and
> "(page->flags & (1UL << PG_dcache_dirty))"
> will never be true at the same time. so the function
> flush_data_cache_page( ) will never be called.
> Then I commented the two condition, the compiler error disappeared.
> I do not understand the phenomenon very clearly, so I need some help.
>
>
I am having a similar problem with 2.4.26 on an NEC VR5500 with a 32k
2-way cache. This is with a 32 bit little-endian kernel, and an ext2
filesystem on an ide hard drive in pio mode.
Removing just the check for PG_dcache_dirty fixes the problem for me.
Along the way, I found a bogus check for cache aliases in c-r4k.c. In
the ld_mmu_r4xx0 function, it has the check:
if (c->dcache.sets * c->dcache.ways > PAGE_SIZE)
which will never work for a 32k cache.
Bob Breuer
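The remark about the alias check can be made concrete with a little arithmetic. The following is only a sketch: the struct and function names are hypothetical (they are not the kernel's), and it assumes that "sets" means sets per way, that pages are 4 KB, and that the VR5500 D-cache uses 32-byte lines, none of which the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's per-cache descriptor. */
struct cache_geom {
	unsigned long sets;	/* number of sets (indexes) per way */
	unsigned long ways;	/* associativity */
	unsigned long linesz;	/* line size in bytes */
};

/* The check quoted above: it compares a line count with a byte count. */
static bool aliases_quoted_check(const struct cache_geom *c,
				 unsigned long page_size)
{
	return c->sets * c->ways > page_size;
}

/* Number of bytes covered by the index bits of one way. */
static unsigned long way_size(const struct cache_geom *c)
{
	assert(c->sets > 0 && c->linesz > 0);
	return c->sets * c->linesz;
}

/*
 * Assumed alternative: aliases are possible in a virtually indexed
 * cache once a single way spans more than one page.
 */
static bool aliases_by_way_size(const struct cache_geom *c,
				unsigned long page_size)
{
	return way_size(c) > page_size;
}
```

Under these assumptions a 32k 2-way cache has 512 sets, so the quoted expression evaluates to 1024, which is not greater than 4096, and no alias is reported. The way size, however, is 16 KB, or four pages, so aliases are possible.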
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-13 22:05 Bob Breuer
2004-05-13 22:05 ` Bob Breuer
@ 2004-05-14 2:59 ` wuming
2004-05-14 7:52 ` Peter Horton
2004-05-15 0:12 ` Ralf Baechle
1 sibling, 2 replies; 19+ messages in thread
From: wuming @ 2004-05-14 2:59 UTC (permalink / raw)
To: linux-mips
>
>
>I am having a similar problem with 2.4.26 on an NEC VR5500 with a 32k
>2-way cache. This is with a 32 bit little-endian kernel, and an ext2
>filesystem on an ide hard drive in pio mode.
>
>Removing just the check for PG_dcache_dirty fixes the problem for me.
>
>Along the way, I found a bogus check for cache aliases in c-r4k.c. In
>the ld_mmu_r4xx0 function, it has the check:
> if (c->dcache.sets * c->dcache.ways > PAGE_SIZE)
>which will never work for a 32k cache.
>
>Bob Breuer
>
>
>
>
>
I now understand the phenomenon, and I think this is a kernel bug.
The real fault is not the test of "PG_dcache_dirty" in __update_cache().
It is in the function filemap_nopage() in mm/filemap.c:
......
success:
	/*
	 * Try read-ahead for sequential areas.
	 */
	if (VM_SequentialReadHint(area))
		nopage_sequential_readahead(area, pgoff, size);

	/*
	 * Found the page and have a reference on it, need to check sharing
	 * and possibly copy it over to another page..
	 */
	mark_page_accessed(page);
	flush_page_to_ram(page);
	return page;
......
flush_page_to_ram() has not been used for a long time, and in kernel 2.4.22
"include/asm-mips/cacheflush.h" defines it as an empty macro:
#define flush_page_to_ram(page) do { } while (0)
So the mapped page is never flushed to RAM, and user space does not see
the latest data in the page.
I think flush_page_to_ram() should be replaced by flush_dcache_page() here.
If flush_dcache_page() does not flush the cache immediately, it sets
PG_dcache_dirty, and the real flush is postponed to __update_cache().
Without a flush_dcache_page() call here, nothing sets PG_dcache_dirty,
so __update_cache() does not flush the page either, and the D-cache
aliasing shows up.
Finally, when I replaced flush_page_to_ram() with flush_dcache_page(),
the internal compiler error disappeared.
I hope this solves your problem too. God bless you! :-)
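The deferred-flush interplay described in this message can be sketched as a toy model. Everything here is hypothetical: the toy_ names, the struct fields and the flush counter are illustrative only, and the condition under which the flush is deferred (page has a mapping but is not yet mapped into user space) is an assumption about how the 2.4 MIPS flush_dcache_page() behaves, not a quotation of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; only the bits this discussion needs. */
struct toy_page {
	bool has_mapping;	/* stands in for page->mapping != NULL */
	bool user_mapped;	/* already mapped into some user space */
	bool dcache_dirty;	/* stands in for PG_dcache_dirty */
	int flushes;		/* counts real D-cache flushes */
};

/* Either flush now or mark the page so the flush happens later. */
static void toy_flush_dcache_page(struct toy_page *p)
{
	assert(p);
	if (p->has_mapping && !p->user_mapped) {
		p->dcache_dirty = true;	/* defer to toy_update_cache() */
		return;
	}
	p->flushes++;
}

/* Runs when a pte is installed; performs any deferred flush. */
static void toy_update_cache(struct toy_page *p, bool addresses_alias)
{
	assert(p);
	if (p->has_mapping && p->dcache_dirty) {
		if (addresses_alias)
			p->flushes++;
		p->dcache_dirty = false;
	}
}
```

In this model, calling toy_update_cache() on a page that nobody has marked dirty performs no flush at all, which matches the symptom reported in the first message. Once toy_flush_dcache_page() has run, the next toy_update_cache() with aliasing addresses flushes exactly once and clears the flag.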
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-13 6:52 problems on D-cache alias in 2.4.22 wuming
@ 2004-05-14 7:45 ` Peter Horton
0 siblings, 0 replies; 19+ messages in thread
From: Peter Horton @ 2004-05-14 7:45 UTC (permalink / raw)
To: wuming; +Cc: linux-mips
wuming wrote:
> I am developing in linux-2.4.22 on the machine with virtual address
>indexed and physical
>address tagged. But when I compile some application programs, I met the
>following error:
>
>cc1: internal compiler error: Segmentation fault
>
>
>
Is this on an IDE drive ?
PIO or DMA ?
P.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-14 2:59 ` wuming
@ 2004-05-14 7:52 ` Peter Horton
2004-05-15 0:12 ` Ralf Baechle
1 sibling, 0 replies; 19+ messages in thread
From: Peter Horton @ 2004-05-14 7:52 UTC (permalink / raw)
To: wuming; +Cc: linux-mips
wuming wrote:
> I have understood the phenomenon, and I think this is a kernel's bug.
> The real wrong place is not the judgement for condition "PG_dcache_dirty"
> in function __update_cache( ).
> in file mm/filemap.c and function filemap_nopage( ):
> ......
> success:
> /*
> * Try read-ahead for sequential areas.
> */
> if (VM_SequentialReadHint(area))
> nopage_sequential_readahead(area, pgoff, size);
>
> /*
> * Found the page and have a reference on it, need to check
> sharing
> * and possibly copy it over to another page..
> */
> mark_page_accessed(page);
> flush_page_to_ram(page);
> return page;
> ......
>
> flush_page_to_ram( ) has not been used for a long time, and in kernel
> 2.4.22
> "include/asm-mips/cacheflush.h"
> #define flush_page_to_ram(page) do { } while (0)
>
> so the mapped page has not been flushed to ram, and the user space
> will not
> know the latest data in the page.
> the flush_page_to_ram( ) should be replaced by flush_dcache_page( ),
> and if the flush_dcache_page( ) does not really flush the cache, it
> will set
> the PG_dcache_dirty, and the real flush will be postponed to
> __update_cache( ).
> and if there is not the flush_dcache_page( ) here, no one will set the
> PG_dcache_dirty,
> and __update_cache( ) will not flush the page too, so the D-cache
> aliasing happens.
>
> at last, when I replaced flush_page_to_ram( ) with flush_dcache_page( ),
> the internal compiler error disappeared.
>
> I hope your problem will be solved by this way too. God bless you! :-)
>
This is probably just hiding your problem. flush_page_to_ram() is not
used anymore.
P.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-14 2:59 ` wuming
2004-05-14 7:52 ` Peter Horton
@ 2004-05-15 0:12 ` Ralf Baechle
2004-05-15 12:31 ` Fuxin Zhang
1 sibling, 1 reply; 19+ messages in thread
From: Ralf Baechle @ 2004-05-15 0:12 UTC (permalink / raw)
To: wuming; +Cc: linux-mips
Wuming,
what kind of filesystem or storage are you using?
Ralf
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-15 0:12 ` Ralf Baechle
@ 2004-05-15 12:31 ` Fuxin Zhang
0 siblings, 0 replies; 19+ messages in thread
From: Fuxin Zhang @ 2004-05-15 12:31 UTC (permalink / raw)
To: Ralf Baechle; +Cc: wuming, linux-mips
We are using an IDE disk with an ext3 filesystem. DMA is on (PIIX4 chip),
but PIO/DMA does not seem to affect the failures.
Ralf Baechle wrote:
>Wuming,
>
>what kind of filesystem or storage are you using?
>
> Ralf
>
>
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: problems on D-cache alias in 2.4.22
@ 2004-05-18 18:17 Bob Breuer
2004-05-18 18:17 ` Bob Breuer
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Bob Breuer @ 2004-05-18 18:17 UTC (permalink / raw)
To: linux-mips
[-- Attachment #1: Type: text/plain, Size: 1393 bytes --]
> -----Original Message-----
> From: linux-mips-bounce@linux-mips.org
> [mailto:linux-mips-bounce@linux-mips.org]On Behalf Of Peter Horton
> Sent: Friday, May 14, 2004 2:53 AM
> To: wuming
> Cc: linux-mips@linux-mips.org
> Subject: Re: problems on D-cache alias in 2.4.22
>
>
> wuming wrote:
>
...
> > at last, when I replaced flush_page_to_ram( ) with
> flush_dcache_page( ),
> > the internal compiler error disappeared.
> >
...
>
> This is probably just hiding your problem. flush_page_to_ram() is not
> used anymore.
>
> P.
>
>
Changing that same place also fixes my problem. However, I came across
the mips cobalt patches and after applying a variation of the IDE cache
fix from there, that also fixes the problem. So it would seem that this
is the same problem as already fixed in the cobalt patch, but showing up
on non-cobalt hardware.
flush_page_to_ram() was made useless around the release of 2.4.21. I
suspect that this broke at that time, seeing how it is broken in
2.4.22 and 2.4.26. From browsing the debian-mips mailing list archives,
it appears that they have not had a stable mips kernel since 2.4.19.
Could this bug be the cause? Are the recent Debian mips kernels still
unstable?
Would anyone with an unstable 2.4.2x kernel be willing to try one of the
attached patches to see if the situation improves?
Bob
[-- Attachment #2: cache_alias_fix1.diff --]
[-- Type: application/octet-stream, Size: 480 bytes --]
Index: mm/filemap.c
===================================================================
RCS file: /home/cvs/linux/mm/filemap.c,v
retrieving revision 1.74.2.14
diff -u -r1.74.2.14 filemap.c
--- mm/filemap.c 20 Feb 2004 01:22:21 -0000 1.74.2.14
+++ mm/filemap.c 18 May 2004 17:18:26 -0000
@@ -2111,7 +2111,7 @@
 	 * and possibly copy it over to another page..
 	 */
 	mark_page_accessed(page);
-	flush_page_to_ram(page);
+	flush_dcache_page(page);
 	return page;

 no_cached_page:
[-- Attachment #3: cache_alias_fix2.diff --]
[-- Type: application/octet-stream, Size: 1543 bytes --]
Index: include/asm-mips/ide.h
===================================================================
RCS file: /home/cvs/linux/include/asm-mips/ide.h,v
retrieving revision 1.11.2.7
diff -u -r1.11.2.7 ide.h
--- include/asm-mips/ide.h 15 Jul 2003 15:08:33 -0000 1.11.2.7
+++ include/asm-mips/ide.h 18 May 2004 17:57:44 -0000
@@ -69,6 +69,49 @@
 #endif
 #include <asm-generic/ide_iops.h>
+#include <asm/r4kcache.h>
+
+static inline void __flush_dcache_range(unsigned long start, unsigned long end)
+{
+	unsigned long dc_size, dc_line, addr;
+
+	dc_size = current_cpu_data.dcache.waysize;
+	dc_line = current_cpu_data.dcache.linesz;
+
+	addr = start & ~(dc_line - 1);
+	end += dc_line - 1;
+
+	if (end - addr < dc_size)
+		for (; addr < end; addr += dc_line)
+			flush_dcache_line(addr);
+	else {
+		/* flush all of dcache */
+		addr = KSEG0;
+		end = addr + dc_size;
+		for (; addr < end; addr += dc_line)
+			flush_dcache_line_indexed(addr);
+	}
+}
+
+#undef insw
+#undef insl
+#undef __ide_insw
+#undef __ide_insl
+
+static inline void __ide_insw(unsigned long port, void *addr, u32 count)
+{
+	__insw(port, addr, count);
+	__flush_dcache_range((unsigned long)addr, (unsigned long)addr + count*2);
+}
+
+static inline void __ide_insl(unsigned long port, void *addr, u32 count)
+{
+	__insl(port, addr, count);
+	__flush_dcache_range((unsigned long)addr, (unsigned long)addr + count*4);
+}
+
+#define insw(port, addr, count) __ide_insw(port, addr, count)
+#define insl(port, addr, count) __ide_insl(port, addr, count)
 #endif /* __KERNEL__ */
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 19:50 ` Ralf Baechle
@ 2004-05-18 18:24 ` Alan Cox
2004-05-18 21:21 ` Peter Horton
1 sibling, 0 replies; 19+ messages in thread
From: Alan Cox @ 2004-05-18 18:24 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Peter Horton, Jun Sun, Bob Breuer, linux-mips
On Maw, 2004-05-18 at 20:50, Ralf Baechle wrote:
> Carelessly written PIO drivers on any architecture would suffer from this
> kind of problem.
It isn't a driver problem. (Ramdisk is a bit special as you get aliases
in funny ways). The block and page cache is responsible for sorting this
out.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 18:17 Bob Breuer
2004-05-18 18:17 ` Bob Breuer
@ 2004-05-18 18:45 ` Jun Sun
2004-05-18 19:10 ` Peter Horton
2004-05-18 20:02 ` Ralf Baechle
2004-05-18 20:10 ` Thiemo Seufer
3 siblings, 1 reply; 19+ messages in thread
From: Jun Sun @ 2004-05-18 18:45 UTC (permalink / raw)
To: Bob Breuer; +Cc: linux-mips, jsun
On Tue, May 18, 2004 at 01:17:38PM -0500, Bob Breuer wrote:
>
> > -----Original Message-----
> > From: linux-mips-bounce@linux-mips.org
> > [mailto:linux-mips-bounce@linux-mips.org]On Behalf Of Peter Horton
> > Sent: Friday, May 14, 2004 2:53 AM
> > To: wuming
> > Cc: linux-mips@linux-mips.org
> > Subject: Re: problems on D-cache alias in 2.4.22
> >
> >
> > wuming wrote:
> >
> ...
> > > at last, when I replaced flush_page_to_ram( ) with
> > flush_dcache_page( ),
> > > the internal compiler error disappeared.
> > >
> ...
> >
> > This is probably just hiding your problem. flush_page_to_ram() is not
> > used anymore.
> >
> > P.
> >
> >
>
> Changing that same place also fixes my problem.
<snip>
Like others suggested, this is not the right fix. flush_page_to_ram()
is correctly nullified. Its job should be done somewhere else
by other routines.
Here are a couple of random ideas for finding the true root cause:
. If a page is shared by multiple user processes, make sure either the CPU
does not have a d-cache aliasing problem (i.e., cache way size is 4KB or less)
or their virtual addresses lie on the "same color strip" of the d-cache.
In other words, they would be cached at the same index within a cache way.
. If a page is modified by the kernel and accessed by userland, make sure
flush_dcache_page() is called right after the modification.
. If a page is modified by userland and accessed by the kernel, I _think_ the
kernel currently still does a flush_dcache_page() call. However, this won't
work on MIPS because the cache lines at the user virtual addresses are not flushed.
Either try to flush with the user virtual address, or do a flush_cache_all(). *ick*
BTW, I _think_ the last problem still exists in 2.6. We probably need
to use the reverse mapping info to fix it.
Jun
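The "same color" condition in the first item of the list above can be written down in a few lines of C. This is a sketch with hypothetical function names; it assumes a virtually indexed cache whose way size and page size are powers of two, and it takes both sizes as parameters rather than reading them from any real kernel structure.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Cache color of a virtual address: the cache index bits that lie
 * above the page offset. With way_size == page_size there is only
 * one color.
 */
static unsigned long cache_color(unsigned long vaddr,
				 unsigned long way_size,
				 unsigned long page_size)
{
	assert(way_size >= page_size);
	assert((way_size & (way_size - 1)) == 0);	/* power of two */
	assert((page_size & (page_size - 1)) == 0);	/* power of two */
	return (vaddr & (way_size - 1)) / page_size;
}

/*
 * Two virtual mappings of the same physical page can hold
 * inconsistent cached copies only if their colors differ.
 */
static bool mappings_alias(unsigned long va1, unsigned long va2,
			   unsigned long way_size,
			   unsigned long page_size)
{
	return cache_color(va1, way_size, page_size) !=
	       cache_color(va2, way_size, page_size);
}
```

For example, with a 16 KB way and 4 KB pages, the addresses 0x80005000 and 0x2aac1000 both have color 1 and do not alias, while 0x80005000 and 0x2aac2000 have colors 1 and 2 and do. With a 4 KB way every address has color 0, which is the "4KB or less" case in the list.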
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 18:45 ` Jun Sun
@ 2004-05-18 19:10 ` Peter Horton
2004-05-18 19:50 ` Ralf Baechle
2004-05-18 22:25 ` Jun Sun
0 siblings, 2 replies; 19+ messages in thread
From: Peter Horton @ 2004-05-18 19:10 UTC (permalink / raw)
To: Jun Sun; +Cc: Bob Breuer, linux-mips
On Tue, May 18, 2004 at 11:45:19AM -0700, Jun Sun wrote:
>
> Like others suggested, this is not the right fix. flush_page_to_ram()
> is correctly nullified. Its job should be done somewhere else
> by other routines.
>
> Here are a couple of random ideas for finding the true root cause:
>
We know what the true root cause is :-)
IDE PIO fills the D-cache with the read data (write allocate) as it
copies it to the page cache.
The kernel maps the page cache page into user space ... BANG! possible
D-cache alias.
The kernel doesn't bother flushing the page cache page from the D-cache
as it's never accessed at its page cache address.
The current fix in the Cobalt patches (2.4 & 2.6) just flushes the read
data out of the D-cache after every IDE insw()/insl(). This is the least
intrusive fix.
Some Sparc machines also see this problem.
P.
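The sequence described in this message can be reproduced in a toy simulation. The sketch below models a tiny direct-mapped, write-back, write-allocate cache that is indexed by virtual address and tagged by physical address. All sizes, names and addresses are made up for illustration and do not correspond to any real CPU or to kernel code.

```c
#include <assert.h>
#include <string.h>

#define LINE	16u		/* bytes per cache line */
#define PAGE	64u		/* toy page size */
#define WAY	(2u * PAGE)	/* one way spans two pages: two colors */
#define NLINES	(WAY / LINE)

static unsigned char ram[4 * PAGE];	/* toy physical memory */

static struct line {
	int valid, dirty;
	unsigned tag;			/* physical address of the line */
	unsigned char data[LINE];
} cache[NLINES];

/* Find the line for an access, refilling it from RAM on a miss. */
static struct line *lookup(unsigned vaddr, unsigned paddr)
{
	struct line *l = &cache[(vaddr % WAY) / LINE];	/* virtual index */
	unsigned tag = paddr - paddr % LINE;		/* physical tag */

	assert(vaddr % PAGE == paddr % PAGE);
	assert(tag + LINE <= sizeof(ram));
	if (!l->valid || l->tag != tag) {
		if (l->valid && l->dirty)	/* write back the victim */
			memcpy(&ram[l->tag], l->data, LINE);
		memcpy(l->data, &ram[tag], LINE);
		l->valid = 1;
		l->dirty = 0;
		l->tag = tag;
	}
	return l;
}

/* Store one byte; the data stays in the cache until written back. */
static void cache_write(unsigned vaddr, unsigned paddr, unsigned char v)
{
	struct line *l = lookup(vaddr, paddr);

	l->data[paddr % LINE] = v;
	l->dirty = 1;
}

static unsigned char cache_read(unsigned vaddr, unsigned paddr)
{
	return lookup(vaddr, paddr)->data[paddr % LINE];
}

/* Write back and invalidate the line that vaddr indexes. */
static void flush_line(unsigned vaddr, unsigned paddr)
{
	struct line *l = &cache[(vaddr % WAY) / LINE];

	if (l->valid && l->tag == paddr - paddr % LINE) {
		if (l->dirty)
			memcpy(&ram[l->tag], l->data, LINE);
		l->valid = 0;
	}
}
```

In this model, a byte written through a "kernel" address of one color (for instance cache_write(0x1000, 128, 0xAB)) is not visible to a read of the same physical address through a "user" address of the other color (cache_read(0x2040, 128)), because the data sits dirty in a different cache line and RAM is still stale. Calling flush_line() on the kernel address before the first user access makes the read return the new data, which is the effect of flushing after each insw()/insl().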
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 19:10 ` Peter Horton
@ 2004-05-18 19:50 ` Ralf Baechle
2004-05-18 18:24 ` Alan Cox
2004-05-18 21:21 ` Peter Horton
2004-05-18 22:25 ` Jun Sun
1 sibling, 2 replies; 19+ messages in thread
From: Ralf Baechle @ 2004-05-18 19:50 UTC (permalink / raw)
To: Peter Horton; +Cc: Jun Sun, Bob Breuer, linux-mips
On Tue, May 18, 2004 at 08:10:19PM +0100, Peter Horton wrote:
> The kernel maps the page cache page into user space ... BANG! possible
> D-cache alias.
>
> The kernel doesn't bother flushing the page cache page from the D-cache
> as it's never accessed at it's page cache address.
It is - after all the driver is copying the data to there. The same
problem also exists in the ramdisk driver and there it has been fixed
properly, it seems.
> The current fix in the Cobalt patches (2.4 & 2.6) just flushes the read
> data out of the D-cache after every IDE insw()/insl(). This is the least
> intrusive fix.
>
> Some Sparc machines also see this problem.
Carelessly written PIO drivers on any architecture would suffer from this
kind of problem.
Ralf
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 18:17 Bob Breuer
2004-05-18 18:17 ` Bob Breuer
2004-05-18 18:45 ` Jun Sun
@ 2004-05-18 20:02 ` Ralf Baechle
2004-05-18 20:10 ` Thiemo Seufer
3 siblings, 0 replies; 19+ messages in thread
From: Ralf Baechle @ 2004-05-18 20:02 UTC (permalink / raw)
To: Bob Breuer; +Cc: linux-mips
On Tue, May 18, 2004 at 01:17:38PM -0500, Bob Breuer wrote:
> Changing that same place also fixes my problem. However, I came across
> the mips cobalt patches and after applying a variation of the IDE cache
> fix from there, that also fixes the problem. So it would seem that this
> is the same problem as already fixed in the cobalt patch, but showing up
> on non-cobalt hardware.
>
> flush_page_to_ram() was made useless around the release of 2.4.21. I
> suspect that this was broken at that time, seeing how it is broken in
> 2.4.22 and 2.4.26. From browsing the debian-mips mailing list archives,
> it appears that they have not had a stable mips kernel since 2.4.19,
> could this bug be the cause? Are the recent Debian mips kernels still
> unstable?
>
> Would anyone with an unstable 2.4.2x kernel be willing to try one of the
> attached patches to see if the situation improves?
flush_page_to_ram has been deprecated for a long, long time, and its use
in memory management was no longer correct for all cases. MIPS was
basically the last major Linux architecture left using it. Replacing
it with flush_dcache_page fixed those correctness problems and delivered
a major speedup. So no sense in whining: flush_page_to_ram won't return.
Ralf
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 18:17 Bob Breuer
` (2 preceding siblings ...)
2004-05-18 20:02 ` Ralf Baechle
@ 2004-05-18 20:10 ` Thiemo Seufer
3 siblings, 0 replies; 19+ messages in thread
From: Thiemo Seufer @ 2004-05-18 20:10 UTC (permalink / raw)
To: linux-mips
Bob Breuer wrote:
[snip]
> From browsing the debian-mips mailing list archives,
> it appears that they have not had a stable mips kernel since 2.4.19,
> could this bug be the cause? Are the recent Debian mips kernels still
> unstable?
The latest 2.4.25/2.4.26 (Debian-)Kernels seem to work well on all
Debian-supported machines. They include some cobalt patch which
isn't in the linux-mips.org CVS (yet).
Thiemo
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 19:50 ` Ralf Baechle
2004-05-18 18:24 ` Alan Cox
@ 2004-05-18 21:21 ` Peter Horton
1 sibling, 0 replies; 19+ messages in thread
From: Peter Horton @ 2004-05-18 21:21 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Peter Horton, Jun Sun, Bob Breuer, linux-mips
On Tue, May 18, 2004 at 09:50:55PM +0200, Ralf Baechle wrote:
> On Tue, May 18, 2004 at 08:10:19PM +0100, Peter Horton wrote:
>
> > The kernel maps the page cache page into user space ... BANG! possible
> > D-cache alias.
> >
> > The kernel doesn't bother flushing the page cache page from the D-cache
> > as it's never accessed at it's page cache address.
>
> It is - after all the driver is copying the data to there. The same
> problem also exists in the ramdisk driver and there it has been fixed
> properly, it seems.
>
I had a dig around but couldn't decide where the proper fix should go.
> > The current fix in the Cobalt patches (2.4 & 2.6) just flushes the read
> > data out of the D-cache after every IDE insw()/insl(). This is the least
> > intrusive fix.
> >
> > Some Sparc machines also see this problem.
>
> Carelessly written PIO drivers on any architecture would suffer from this
> kind of problem.
>
Are you saying it's a driver problem ? :-)
P.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 19:10 ` Peter Horton
2004-05-18 19:50 ` Ralf Baechle
@ 2004-05-18 22:25 ` Jun Sun
2004-05-18 23:29 ` Peter Horton
1 sibling, 1 reply; 19+ messages in thread
From: Jun Sun @ 2004-05-18 22:25 UTC (permalink / raw)
To: Peter Horton; +Cc: Bob Breuer, linux-mips, jsun
On Tue, May 18, 2004 at 08:10:19PM +0100, Peter Horton wrote:
> On Tue, May 18, 2004 at 11:45:19AM -0700, Jun Sun wrote:
> >
> > Like others suggested, this is not the right fix. flush_page_to_ram()
> > is correctly nullified. Its job should be done somewhere else
> > by other routines.
> >
> > Here are a couple of random ideas for finding the true root cause:
> >
>
> We know what the true root cause is :-)
>
> IDE PIO fills the D-cache with the read data (write allocate) as it
> copies it to the page cache.
>
> The kernel maps the page cache page into user space ... BANG! possible
> D-cache alias.
>
> The kernel doesn't bother flushing the page cache page from the D-cache
> as it's never accessed at it's page cache address.
>
The kernel (or driver) should flush the page if it is mapped to user space
and the content is modified.
> The current fix in the Cobalt patches (2.4 & 2.6) just flushes the read
> data out of the D-cache after every IDE insw()/insl(). This is the least
> intrusive fix.
>
It should be fixed at a higher layer before we return back to userland.
If you can illustrate the call stack, I can probably take a look and
give my opinion.
Jun
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: problems on D-cache alias in 2.4.22
2004-05-18 22:25 ` Jun Sun
@ 2004-05-18 23:29 ` Peter Horton
0 siblings, 0 replies; 19+ messages in thread
From: Peter Horton @ 2004-05-18 23:29 UTC (permalink / raw)
To: Jun Sun; +Cc: linux-mips
On Tue, May 18, 2004 at 03:25:39PM -0700, Jun Sun wrote:
> >
> > IDE PIO fills the D-cache with the read data (write allocate) as it
> > copies it to the page cache.
> >
> > The kernel maps the page cache page into user space ... BANG! possible
> > D-cache alias.
> >
> > The kernel doesn't bother flushing the page cache page from the D-cache
> > as it's never accessed at it's page cache address.
> >
>
> The kernel (or driver) should flush the page if it is mapped to user space
> and the content is modified.
>
We just need a hook so that we can flush a page from the D-cache once
it's read from a block device into the page cache.
> > The current fix in the Cobalt patches (2.4 & 2.6) just flushes the read
> > data out of the D-cache after every IDE insw()/insl(). This is the least
> > intrusive fix.
> >
>
> It should be fixed at a higher layer before we return back to userland.
>
> If you can illustrate the call stack, I can probably take a look and
> give my opinion.
>
No call stack, sorry. It was months ago that I debugged this.
IIRC I picked up the aliases with memcmp() in do_no_page() in
mm/memory.c.
P.
^ permalink raw reply [flat|nested] 19+ messages in thread