* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
@ 2003-12-12 10:45 Peter Horton
2003-12-13 17:07 ` Ralf Baechle
From: Peter Horton @ 2003-12-12 10:45 UTC
To: linux-mips
[-- Attachment #1: Type: text/plain, Size: 658 bytes --]
More info on the random segmentation faults and data corruption on my Qube2.
2.4.21 from CVS is the first kernel to exhibit the problem. I tracked it
down to the cache handling changes that went in between 2.4.20 and 2.4.21.
By (not very scientifically) removing flush_dcache_page() and
re-instating flush_page_to_ram() I managed to get the 2.4.21 kernel
stable (see attached patch). Applying a similar patch to 2.4.23 (CVS
HEAD) allows me to run 2.4.23 too.
I don't know how to track the problem any further - the kernel's cache
handling is a bit out of my league.
Anyone got a clue stick they can point me in the right direction with ?
P.
[-- Attachment #2: cobalt-patch --]
[-- Type: text/plain, Size: 3261 bytes --]
diff -urN linux-2.4.21-xxx/arch/mips/mm/c-r4k.c linux-2.4.21/arch/mips/mm/c-r4k.c
--- linux-2.4.21-xxx/arch/mips/mm/c-r4k.c Thu Dec 11 20:20:28 2003
+++ linux-2.4.21/arch/mips/mm/c-r4k.c Thu Dec 11 10:02:54 2003
@@ -1037,16 +1037,6 @@
return 1;
}
-static void r4k_flush_page_to_ram_d16(struct page *page)
-{
- blast_dcache16_page((unsigned long)page_address(page));
-}
-
-static void r4k_flush_page_to_ram_d32(struct page *page)
-{
- blast_dcache32_page((unsigned long)page_address(page));
-}
-
static void __init setup_noscache_funcs(void)
{
unsigned int prid;
@@ -1059,7 +1049,6 @@
_clear_page = r4k_clear_page32_d16;
_copy_page = r4k_copy_page_d16;
- _flush_page_to_ram = r4k_flush_page_to_ram_d16;
break;
case 32:
prid = read_c0_prid() & 0xfff0;
@@ -1076,8 +1065,6 @@
_clear_page = r4k_clear_page32_d32;
_copy_page = r4k_copy_page_d32;
}
-
- _flush_page_to_ram = r4k_flush_page_to_ram_d32;
break;
}
}
diff -urN linux-2.4.21-xxx/arch/mips/mm/cache.c linux-2.4.21/arch/mips/mm/cache.c
--- linux-2.4.21-xxx/arch/mips/mm/cache.c Thu Dec 11 20:15:37 2003
+++ linux-2.4.21/arch/mips/mm/cache.c Fri Jul 18 15:16:06 2003
@@ -20,8 +20,6 @@
return 0;
}
-#if 0
-
void flush_dcache_page(struct page *page)
{
unsigned long addr;
@@ -67,5 +65,3 @@
}
EXPORT_SYMBOL(flush_dcache_page);
-
-#endif
diff -urN linux-2.4.21-xxx/arch/mips/mm/loadmmu.c linux-2.4.21/arch/mips/mm/loadmmu.c
--- linux-2.4.21-xxx/arch/mips/mm/loadmmu.c Thu Dec 11 20:25:23 2003
+++ linux-2.4.21/arch/mips/mm/loadmmu.c Thu Dec 11 10:02:54 2003
@@ -40,8 +40,6 @@
void (*_flush_data_cache_page)(unsigned long addr);
void (*_flush_icache_all)(void);
-void (*_flush_page_to_ram)(struct page * page);
-
#ifdef CONFIG_NONCOHERENT_IO
/* DMA cache operations. */
diff -urN linux-2.4.21-xxx/include/asm-mips/cacheflush.h linux-2.4.21/include/asm-mips/cacheflush.h
--- linux-2.4.21-xxx/include/asm-mips/cacheflush.h Thu Dec 11 20:22:51 2003
+++ linux-2.4.21/include/asm-mips/cacheflush.h Tue Apr 1 00:29:06 2003
@@ -46,16 +46,12 @@
extern void (*_flush_icache_all)(void);
extern void (*_flush_data_cache_page)(unsigned long addr);
-extern void (*_flush_page_to_ram)(struct page * page);
-
-#define flush_dcache_page(page) do { } while(0)
-
#define flush_cache_all() _flush_cache_all()
#define __flush_cache_all() ___flush_cache_all()
#define flush_cache_mm(mm) _flush_cache_mm(mm)
#define flush_cache_range(mm,start,end) _flush_cache_range(mm,start,end)
#define flush_cache_page(vma,page) _flush_cache_page(vma, page)
-#define flush_page_to_ram(page) _flush_page_to_ram(page)
+#define flush_page_to_ram(page) do { } while (0)
#define flush_icache_range(start, end) _flush_icache_range(start,end)
#define flush_icache_user_range(vma, page, addr, len) \
diff -urN linux-2.4.21-xxx/include/asm-mips/pgtable.h linux-2.4.21/include/asm-mips/pgtable.h
--- linux-2.4.21-xxx/include/asm-mips/pgtable.h Thu Dec 11 20:36:56 2003
+++ linux-2.4.21/include/asm-mips/pgtable.h Thu Dec 11 10:04:27 2003
@@ -261,7 +261,7 @@
unsigned long address, pte_t pte)
{
__update_tlb(vma, address, pte);
-// __update_cache(vma, address, pte);
+ __update_cache(vma, address, pte);
}
/* Swap entries must have VALID and GLOBAL bits cleared. */
* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
2003-12-12 10:45 Kernel 2.4.23 on Cobalt Qube2 - area of problem Peter Horton
@ 2003-12-13 17:07 ` Ralf Baechle
2003-12-13 18:20 ` Peter Horton
2003-12-13 18:38 ` Peter Horton
From: Ralf Baechle @ 2003-12-13 17:07 UTC
To: Peter Horton; +Cc: linux-mips
On Fri, Dec 12, 2003 at 10:45:08AM +0000, Peter Horton wrote:
> More info on the random segmentation faults and data corruption on my Qube2.
>
> 2.4.21 from CVS is the first kernel to exhibit the problem. I tracked it
> down to the cache handling changes that went in between 2.4.20 and 2.4.21.
>
> By (not very scientifically) removing flush_dcache_page() and
> re-instating flush_page_to_ram() I managed to get the 2.4.21 kernel
> stable (see attached patch). Applying a similar patch to 2.4.23 (CVS
> HEAD) allows me to run 2.4.23 too.
>
> I don't know how to track the problem any further - the kernel's cache
> handling is a bit out of my league.
>
> Anyone got a clue stick they can point me in the right direction with ?
Can you put a printk into c-r4k.c and print the value of the
shm_align_mask variable? I want to make sure it's got a sane value on
your box. Also the first few lines of your bootup messages with the
processor and cache stuff would be useful.
Ralf
* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
2003-12-13 17:07 ` Ralf Baechle
@ 2003-12-13 18:20 ` Peter Horton
2003-12-13 18:38 ` Peter Horton
From: Peter Horton @ 2003-12-13 18:20 UTC
To: Ralf Baechle; +Cc: Peter Horton, linux-mips
On Sat, Dec 13, 2003 at 06:07:51PM +0100, Ralf Baechle wrote:
> On Fri, Dec 12, 2003 at 10:45:08AM +0000, Peter Horton wrote:
>
> > More info on the random segmentation faults and data corruption on my Qube2.
> >
> > 2.4.21 from CVS is the first kernel to exhibit the problem. I tracked it
> > down to the cache handling changes that went in between 2.4.20 and 2.4.21.
> >
> > By (not very scientifically) removing flush_dcache_page() and
> > re-instating flush_page_to_ram() I managed to get the 2.4.21 kernel
> > stable (see attached patch). Applying a similar patch to 2.4.23 (CVS
> > HEAD) allows me to run 2.4.23 too.
> >
> > I don't know how to track the problem any further - the kernel's cache
> > handling is a bit out of my league.
> >
> > Anyone got a clue stick they can point me in the right direction with ?
>
> Can you put a printk into c-r4k.c and print the value of the
> shm_align_mask variable? I want to make sure it's got a sane value on
> your box. Also the first few lines of your bootup messages with the
> processor and cache stuff would be useful.
>
See below. All the cache settings and the shm_align_mask look fine
according to the RM5231 data sheet I have here.
At the moment the only change I make from the linux_2_4 HEAD kernel is
to update include/asm-mips/cacheflush.h so:
-#define flush_page_to_ram(page) do { } while(0)
+#define flush_page_to_ram(page) flush_dcache_page(page)
This single change gives me a rock solid kernel, so it masks the problem
somehow.
P.
CPU revision is: 000028a0
FPU revision is: 000028a0
D-cache:
size 32768
linesz 32
sets 512
ways 2
waysize 16384
waybit 14
I-cache:
size 32768
linesz 32
sets 512
ways 2
waysize 16384
waybit 14
Primary instruction cache 32kB, physically tagged, 2-way, linesize 32 bytes.
Primary data cache 32kB 2-way, linesize 32 bytes.
shm_align_mask 0x3fff
Linux version 2.4.23 (pdh@skeleton-jack) (gcc version 2.95.4 20011002 (Debian prerelease)) #6 Sat Dec 13 18:13:09 GMT 2003
Determined physical RAM map:
memory: 02000000 @ 00000000 (usable)
On node 0 totalpages: 8192
zone(0): 8192 pages.
zone(1): 0 pages.
zone(2): 0 pages.
Kernel command line: console=ttyS0,115200 ide1=noprobe root=/dev/hda2
ide_setup: ide1=noprobe
Calibrating delay loop... 249.03 BogoMIPS
Memory: 30380k/32768k available (1147k kernel code, 2388k reserved, 168k data, 100k init, 0k highmem)
...
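The D-cache figures in the log above can be cross-checked with a little arithmetic. The following is only a sketch, not code from the kernel tree: the struct and function names are hypothetical, the 4 kB page size is inferred from 32 MB of RAM being reported as 8192 pages, and the rule that shm_align_mask is the larger of (way size - 1) and (page size - 1) is an assumption about how c-r4k.c derives it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical struct holding the D-cache fields printed in the boot log. */
struct cache_geom {
    unsigned long size, linesz, sets, ways, waysize, waybit;
};

/* Assumed page size: 32 MB of RAM reported as 8192 pages gives 4 kB. */
#define SKETCH_PAGE_SIZE 4096UL

/* Returns 1 if the reported fields agree with each other. */
static int geom_consistent(const struct cache_geom *c)
{
    return c->waysize == c->size / c->ways &&
           c->sets == c->waysize / c->linesz &&
           (1UL << c->waybit) == c->waysize;
}

/* Assumed derivation: the larger of (way size - 1) and (page size - 1). */
static unsigned long expected_align_mask(const struct cache_geom *c)
{
    unsigned long way = c->waysize - 1;
    unsigned long page = SKETCH_PAGE_SIZE - 1;
    return way > page ? way : page;
}

/* Checks the values from the log above and prints the derived mask. */
static void check_qube2_dcache(void)
{
    const struct cache_geom d = { 32768, 32, 512, 2, 16384, 14 };

    assert(geom_consistent(&d));
    assert(expected_align_mask(&d) == 0x3fffUL);
    printf("shm_align_mask %#lx\n", expected_align_mask(&d));
}
```

Calling check_qube2_dcache() from a small main() exercises the checks. Under these assumptions a 16 kB way with 4 kB pages gives four possible page colours, which is why the mask covers bits above the page offset.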
* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
2003-12-13 17:07 ` Ralf Baechle
2003-12-13 18:20 ` Peter Horton
@ 2003-12-13 18:38 ` Peter Horton
From: Peter Horton @ 2003-12-13 18:38 UTC
To: Ralf Baechle; +Cc: Peter Horton, linux-mips
On Sat, Dec 13, 2003 at 06:07:51PM +0100, Ralf Baechle wrote:
> On Fri, Dec 12, 2003 at 10:45:08AM +0000, Peter Horton wrote:
>
> > More info on the random segmentation faults and data corruption on my Qube2.
> >
> > 2.4.21 from CVS is the first kernel to exhibit the problem. I tracked it
> > down to the cache handling changes that went in between 2.4.20 and 2.4.21.
> >
> > By (not very scientifically) removing flush_dcache_page() and
> > re-instating flush_page_to_ram() I managed to get the 2.4.21 kernel
> > stable (see attached patch). Applying a similar patch to 2.4.23 (CVS
> > HEAD) allows me to run 2.4.23 too.
> >
> > I don't know how to track the problem any further - the kernel's cache
> > handling is a bit out of my league.
> >
> > Anyone got a clue stick they can point me in the right direction with ?
>
> Can you put a printk into c-r4k.c and print the value of the
> shm_align_mask variable? I want to make sure it's got a sane value on
> your box. Also the first few lines of your bootup messages with the
> processor and cache stuff would be useful.
>
I've just tried changing pages_do_alias() to always return non-zero. The
resulting kernel is still unstable - segmentation faults when loading
shared libraries, signal 11 when compiling etc.
P.
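For readers following along, the alias test being overridden here can be sketched roughly as below. This is an illustration under assumptions, not the kernel source: the sketch_ names are hypothetical, the XOR-and-mask form is assumed to match what pages_do_alias() does, and the mask is the 0x3fff value reported in the earlier boot log.

```c
#include <assert.h>

/* Mask value reported in the boot log earlier in this thread. */
static const unsigned long sketch_align_mask = 0x3fffUL;

/* Assumed form of the test: two addresses alias when they differ
 * in any bit covered by the mask. */
static int sketch_pages_do_alias(unsigned long a, unsigned long b)
{
    return ((a ^ b) & sketch_align_mask) != 0;
}

/* The experiment described above: report an alias for every pair. */
static int sketch_pages_always_alias(unsigned long a, unsigned long b)
{
    (void)a;
    (void)b;
    return 1;
}

/* A few sample address pairs, assuming 4 kB pages and a 16 kB way. */
static void alias_demo(void)
{
    /* Adjacent 4 kB pages differ in bit 12, which the mask covers. */
    assert(sketch_pages_do_alias(0x0000UL, 0x1000UL));
    /* Addresses 16 kB apart have identical masked bits. */
    assert(!sketch_pages_do_alias(0x1000UL, 0x5000UL));
    /* The forced variant reports an alias even for that pair. */
    assert(sketch_pages_always_alias(0x1000UL, 0x5000UL));
}
```

If the real test has this shape, forcing it to return non-zero should only make callers treat more pairs as aliasing. That the kernel stays unstable would then hint that the fault lies somewhere other than the alias test itself, though this is a guess rather than a diagnosis.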