Linux MIPS Architecture development
* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
From: Peter Horton @ 2003-12-12 10:45 UTC
  To: linux-mips

[-- Attachment #1: Type: text/plain, Size: 658 bytes --]

More info on the random segmentation faults and data corruption on my Qube2.

2.4.21 from CVS is the first kernel to exhibit the problem. I tracked it 
down to the cache handling changes that went in between 2.4.20 and 2.4.21.

By (not very scientifically) removing flush_dcache_page() and 
re-instating flush_page_to_ram() I managed to get the 2.4.21 kernel 
stable (see attached patch). Applying a similar patch to 2.4.23 (CVS 
HEAD) allows me to run 2.4.23 too.

I don't know how to track the problem any further - the kernel's cache 
handling is a bit out of my league.

Anyone got a clue stick they can point me in the right direction with?

P.

[-- Attachment #2: cobalt-patch --]
[-- Type: text/plain, Size: 3261 bytes --]

diff -urN linux-2.4.21-xxx/arch/mips/mm/c-r4k.c linux-2.4.21/arch/mips/mm/c-r4k.c
--- linux-2.4.21-xxx/arch/mips/mm/c-r4k.c	Thu Dec 11 20:20:28 2003
+++ linux-2.4.21/arch/mips/mm/c-r4k.c	Thu Dec 11 10:02:54 2003
@@ -1037,16 +1037,6 @@
 	return 1;
 }
 
-static void r4k_flush_page_to_ram_d16(struct page *page)
-{
-	blast_dcache16_page((unsigned long)page_address(page));
-}
-
-static void r4k_flush_page_to_ram_d32(struct page *page)
-{
-	blast_dcache32_page((unsigned long)page_address(page));
-}
-
 static void __init setup_noscache_funcs(void)
 {
 	unsigned int prid;
@@ -1059,7 +1049,6 @@
 			_clear_page = r4k_clear_page32_d16;
 		_copy_page = r4k_copy_page_d16;
 
-		_flush_page_to_ram = r4k_flush_page_to_ram_d16;
 		break;
 	case 32:
 		prid = read_c0_prid() & 0xfff0;
@@ -1076,8 +1065,6 @@
 				_clear_page = r4k_clear_page32_d32;
 			_copy_page = r4k_copy_page_d32;
 		}
-
-		_flush_page_to_ram = r4k_flush_page_to_ram_d32;
 		break;
 	}
 }
diff -urN linux-2.4.21-xxx/arch/mips/mm/cache.c linux-2.4.21/arch/mips/mm/cache.c
--- linux-2.4.21-xxx/arch/mips/mm/cache.c	Thu Dec 11 20:15:37 2003
+++ linux-2.4.21/arch/mips/mm/cache.c	Fri Jul 18 15:16:06 2003
@@ -20,8 +20,6 @@
 	return 0;
 }
 
-#if 0
-
 void flush_dcache_page(struct page *page)
 {
 	unsigned long addr;
@@ -67,5 +65,3 @@
 }
 
 EXPORT_SYMBOL(flush_dcache_page);
-
-#endif
diff -urN linux-2.4.21-xxx/arch/mips/mm/loadmmu.c linux-2.4.21/arch/mips/mm/loadmmu.c
--- linux-2.4.21-xxx/arch/mips/mm/loadmmu.c	Thu Dec 11 20:25:23 2003
+++ linux-2.4.21/arch/mips/mm/loadmmu.c	Thu Dec 11 10:02:54 2003
@@ -40,8 +40,6 @@
 void (*_flush_data_cache_page)(unsigned long addr);
 void (*_flush_icache_all)(void);
 
-void (*_flush_page_to_ram)(struct page * page);
-
 #ifdef CONFIG_NONCOHERENT_IO
 
 /* DMA cache operations. */
diff -urN linux-2.4.21-xxx/include/asm-mips/cacheflush.h linux-2.4.21/include/asm-mips/cacheflush.h
--- linux-2.4.21-xxx/include/asm-mips/cacheflush.h	Thu Dec 11 20:22:51 2003
+++ linux-2.4.21/include/asm-mips/cacheflush.h	Tue Apr  1 00:29:06 2003
@@ -46,16 +46,12 @@
 extern void (*_flush_icache_all)(void);
 extern void (*_flush_data_cache_page)(unsigned long addr);
 
-extern void (*_flush_page_to_ram)(struct page * page);
-
-#define flush_dcache_page(page)		do { } while(0)
-
 #define flush_cache_all()		_flush_cache_all()
 #define __flush_cache_all()		___flush_cache_all()
 #define flush_cache_mm(mm)		_flush_cache_mm(mm)
 #define flush_cache_range(mm,start,end)	_flush_cache_range(mm,start,end)
 #define flush_cache_page(vma,page)	_flush_cache_page(vma, page)
-#define flush_page_to_ram(page)		_flush_page_to_ram(page)
+#define flush_page_to_ram(page)		do { } while (0)
 
 #define flush_icache_range(start, end)	_flush_icache_range(start,end)
 #define flush_icache_user_range(vma, page, addr, len) \
diff -urN linux-2.4.21-xxx/include/asm-mips/pgtable.h linux-2.4.21/include/asm-mips/pgtable.h
--- linux-2.4.21-xxx/include/asm-mips/pgtable.h	Thu Dec 11 20:36:56 2003
+++ linux-2.4.21/include/asm-mips/pgtable.h	Thu Dec 11 10:04:27 2003
@@ -261,7 +261,7 @@
 	unsigned long address, pte_t pte)
 {
 	__update_tlb(vma, address, pte);
-//	__update_cache(vma, address, pte);
+	__update_cache(vma, address, pte);
 }
 
 /* Swap entries must have VALID and GLOBAL bits cleared. */


* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
From: Ralf Baechle @ 2003-12-13 17:07 UTC
  To: Peter Horton; +Cc: linux-mips

On Fri, Dec 12, 2003 at 10:45:08AM +0000, Peter Horton wrote:

> More info on the random segmentation faults and data corruption on my Qube2.
> 
> 2.4.21 from CVS is the first kernel to exhibit the problem. I tracked it 
> down to the cache handling changes that went in between 2.4.20 and 2.4.21.
> 
> By (not very scientifically) removing flush_dcache_page() and 
> re-instating flush_page_to_ram() I managed to get the 2.4.21 kernel 
> stable (see attached patch). Applying a similar patch to 2.4.23 (CVS 
> HEAD) allows me to run 2.4.23 too.
> 
> I don't know how to track the problem any further - the kernel's cache 
> handling is a bit out of my league.
> 
> Anyone got a clue stick they can point me in the right direction with?

Can you put a printk into c-r4k.c and print the value of the
shm_align_mask variable?  I want to make sure it's got a sane value on
your box.  Also the first few lines of your bootup messages with the
processor and cache stuff would be useful.

  Ralf


* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
From: Peter Horton @ 2003-12-13 18:20 UTC
  To: Ralf Baechle; +Cc: Peter Horton, linux-mips

On Sat, Dec 13, 2003 at 06:07:51PM +0100, Ralf Baechle wrote:
> Can you put a printk into c-r4k.c and print the value of the
> shm_align_mask variable?  I want to make sure it's got a sane value on
> your box.  Also the first few lines of your bootup messages with the
> processor and cache stuff would be useful.
> 

See below. All the cache settings and the shm_align_mask look fine
according to the RM5231 data sheet I have here.

At the moment the only change I have made to the linux_2_4 HEAD kernel
is this one, in include/asm-mips/cacheflush.h:

-#define flush_page_to_ram(page)	do { } while(0)
+#define flush_page_to_ram(page)	flush_dcache_page(page)

This single change gives me a rock-solid kernel, so it masks the problem
somehow.

P.

CPU revision is: 000028a0
FPU revision is: 000028a0
D-cache:
  size    32768
  linesz  32
  sets    512
  ways    2
  waysize 16384
  waybit  14
I-cache:
  size    32768
  linesz  32
  sets    512
  ways    2
  waysize 16384
  waybit  14
Primary instruction cache 32kB, physically tagged, 2-way, linesize 32 bytes.
Primary data cache 32kB 2-way, linesize 32 bytes.
shm_align_mask 0x3fff
Linux version 2.4.23 (pdh@skeleton-jack) (gcc version 2.95.4 20011002 (Debian prerelease)) #6 Sat Dec 13 18:13:09 GMT 2003
Determined physical RAM map:
 memory: 02000000 @ 00000000 (usable)
On node 0 totalpages: 8192
zone(0): 8192 pages.
zone(1): 0 pages.
zone(2): 0 pages.
Kernel command line: console=ttyS0,115200 ide1=noprobe root=/dev/hda2 
ide_setup: ide1=noprobe
Calibrating delay loop... 249.03 BogoMIPS
Memory: 30380k/32768k available (1147k kernel code, 2388k reserved, 168k data, 100k init, 0k highmem)
...


* Re: Kernel 2.4.23 on Cobalt Qube2 - area of problem
From: Peter Horton @ 2003-12-13 18:38 UTC
  To: Ralf Baechle; +Cc: Peter Horton, linux-mips

On Sat, Dec 13, 2003 at 06:07:51PM +0100, Ralf Baechle wrote:
> Can you put a printk into c-r4k.c and print the value of the
> shm_align_mask variable?  I want to make sure it's got a sane value on
> your box.  Also the first few lines of your bootup messages with the
> processor and cache stuff would be useful.
> 

I've just tried changing pages_do_alias() to always return non-zero. The
resulting kernel is still unstable: segmentation faults when loading
shared libraries, signal 11 when compiling, etc.

P.

