Linux PARISC architecture development
* [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
@ 2004-06-04 20:25 Grant Grundler
  2004-06-05  6:51 ` Grant Grundler
  0 siblings, 1 reply; 14+ messages in thread
From: Grant Grundler @ 2004-06-04 20:25 UTC (permalink / raw)
  To: parisc-linux, parisc-linux-announce

I've committed PA8800/ZX1 support. It's obviously not complete
and more notes are here:
http://lists.parisc-linux.org/pipermail/parisc-linux-cvs/2004-June/034204.html

grant
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-04 20:25 [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2 Grant Grundler
@ 2004-06-05  6:51 ` Grant Grundler
  2004-06-05 14:10   ` James Bottomley
  0 siblings, 1 reply; 14+ messages in thread
From: Grant Grundler @ 2004-06-05  6:51 UTC (permalink / raw)
  To: Grant Grundler; +Cc: parisc-linux

On Fri, Jun 04, 2004 at 02:25:46PM -0600, Grant Grundler wrote:
> I've committed PA8800/ZX1 support. It's obviously not complete
> and more notes are here:
> http://lists.parisc-linux.org/pipermail/parisc-linux-cvs/2004-June/034204.html

One of the errata I forgot to mention was segmentation faults
or other transient failures. The failures are typically
segfaults but could be "internal errors" in gcc or a mis-referenced
header in cpp. In all cases they are transient, i.e. not reproducible
on retry.

On the pa8800 machine, a "time make -j2" of the kernel would fail
in ~10 seconds. "time make" would fail anywhere from 30 seconds
to 6 minutes after starting. Restarting the job would let it run
for another chunk of time.

To me, it all suggests the PA8800 32M L2 cache isn't being
flushed when it might need to be. I don't think it's a new
problem. Just an old one that's easier to reproduce with
the bigger cache.

grant

* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-05  6:51 ` Grant Grundler
@ 2004-06-05 14:10   ` James Bottomley
  2004-06-05 21:05     ` Grant Grundler
  0 siblings, 1 reply; 14+ messages in thread
From: James Bottomley @ 2004-06-05 14:10 UTC (permalink / raw)
  To: Grant Grundler; +Cc: PARISC list

On Sat, 2004-06-05 at 01:51, Grant Grundler wrote:
> To me, it all suggest the PA8800 32M L2 cache isn't being
> flushed when it might need to be. I don't think it's a new
> problem. Just an old one that's easier to reproduce with
> the bigger cache.

Your explanation is possible, but it's highly unlikely to be an existing
problem.  PA currently uses the big hammer approach to cache coherency
and flushes everything on virtually every large mmu changing operation. 
If there's a caching problem in between the flushes it should have shown
up on much smaller cache machines as well.

Also, when doing the no flushing updates to improve fork/exec, I removed
the global flushing so now any cache mismanagement would become
cumulative and should definitely have been seen.

My money would be on an additional architectural requirement of the
PA8800 (maybe even an existing PA one that the <PA8800 just don't need)
that we don't respect.

James



* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-05 14:10   ` James Bottomley
@ 2004-06-05 21:05     ` Grant Grundler
  2004-06-05 21:19       ` James Bottomley
  0 siblings, 1 reply; 14+ messages in thread
From: Grant Grundler @ 2004-06-05 21:05 UTC (permalink / raw)
  To: James Bottomley; +Cc: PARISC list

On Sat, Jun 05, 2004 at 09:10:51AM -0500, James Bottomley wrote:
> Your explanation is possible, but it's highly unlikely to be an existing
> problem.  PA currently uses the big hammer approach to cache coherency
> and flushes everything on virtually every large mmu changing operation. 
> If there's a caching problem in between the flushes it should have shown
> up on much smaller cache machines as well.
> 
> Also, when doing the no flushing updates to improve fork/exec, I removed
> the global flushing so now any cache mismanagement would become
> cumulative and should definitely have been seen.

Well, I can only point at the difference in cache size.

> My money would be on an additional architectural requirement of the
> PA8800 (maybe even an existing PA one that the <PA8800 just don't need)
> that we don't respect.

yes - and we've changed chipsets too.

Any good ideas on how to prove IO is coherent?

It might be the same problem that Naresh described as "SCSI DMA problems".
I just happen to be using NFS Root instead.

But I found one bug in Naresh's port that might explain his problem
(wasn't flushing IO TLB properly). It would be interesting to hear
if 2.6.7-rc2-pa3 works better for him.

grant

* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-05 21:05     ` Grant Grundler
@ 2004-06-05 21:19       ` James Bottomley
  2004-06-05 22:21         ` Grant Grundler
  2004-06-11  5:58         ` Grant Grundler
  0 siblings, 2 replies; 14+ messages in thread
From: James Bottomley @ 2004-06-05 21:19 UTC (permalink / raw)
  To: Grant Grundler; +Cc: PARISC list

On Sat, 2004-06-05 at 16:05, Grant Grundler wrote:
> Well, I can only point at the difference in cache size.

The pa8800 only has a 750k/750k VIPT cache, that's smaller than my
raven.  The 32M L2 cache is PIPT, which doesn't suffer from aliasing or
address remapping effects---in fact, the PA engineers probably arranged
for a fdc not to flush it because there's no point; the only coherency
problem the PIPT cache has is with I/O, which is supposed to be fully
coherent in the ZX1, isn't it?  Thus, we'd only pick up a caching
problem like you describe from the VIPT caches.

> > My money would be on an additional architectural requirement of the
> > PA8800 (maybe even an existing PA one that the <PA8800 just don't need)
> > that we don't respect.
> 
> yes - and we've changed chipsets too.
> 
> Any good ideas on how to prove IO is coherent?

Well, yes, but not without driver magic.  You program a device to take a
piece of data in and rewrite it to a different buffer, then you compare
buffers (making sure the first had a pattern in it and the second was
completely clear).
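
The buffer comparison described above can be sketched in plain C. This is a user-space illustration only, not driver code: `copy_fn` is a hypothetical hook standing in for "program the device to read one buffer and write it to another", and both copy functions below are CPU stand-ins with invented names. A real test would need a driver to set up the DMA and would want many runs with different patterns and sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hook: a real driver would program the device to read
 * `len` bytes from `src` and write them to `dst` by DMA. */
typedef void (*copy_fn)(void *dst, const void *src, size_t len);

/* Stand-in for a coherent transfer: a plain CPU copy, no device involved. */
void cpu_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Stand-in for a broken transfer: the last 16 bytes never arrive. */
void stale_tail_copy(void *dst, const void *src, size_t len)
{
	if (len > 16)
		memcpy(dst, src, len - 16);
}

/* Put a known pattern in src, clear dst, run the copy, then compare.
 * Returns the offset of the first mismatching byte, or -1 if all match. */
long coherency_check(uint8_t *src, uint8_t *dst, size_t len,
		     uint8_t seed, copy_fn copy)
{
	size_t i;

	assert(src != NULL && dst != NULL && copy != NULL);

	for (i = 0; i < len; i++)
		src[i] = (uint8_t)(seed + i);	/* position-dependent pattern */
	memset(dst, 0, len);			/* second buffer completely clear */

	copy(dst, src, len);

	for (i = 0; i < len; i++)
		if (dst[i] != src[i])
			return (long)i;
	return -1;
}
```

Reporting the first bad offset rather than a plain pass/fail makes it easier to see whether failures line up with cache-line or page boundaries.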

> It might be the same problem that Naresh described as "SCSI DMA problems".
> I just happen to be using NFS Root instead.
> 
> But I found one bug in Naresh's port that might explain his problem
> (wasn't flushing IO TLB properly). It would be interesting to hear
> if 2.6.7-rc2-pa3 works better for him.

Well, it could be an I/O coherency problem, but if you have one of
those, I'm surprised it boots at all.

James



* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-05 21:19       ` James Bottomley
@ 2004-06-05 22:21         ` Grant Grundler
  2004-06-11  5:58         ` Grant Grundler
  1 sibling, 0 replies; 14+ messages in thread
From: Grant Grundler @ 2004-06-05 22:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: PARISC list

On Sat, Jun 05, 2004 at 04:19:24PM -0500, James Bottomley wrote:
> On Sat, 2004-06-05 at 16:05, Grant Grundler wrote:
> The pa8800 only has a 750k/750k VIPT cache, that's smaller than my
> raven.  The 32M L2 cache is PIPT, which doesn't suffer from aliasing or
> address remapping effects---in fact, the PA engineers probably arranged
> for a fdc not to flush it because there's no point; the only coherency
> problems the PIPT cache has is with I/O, which is supposed to be fully
> coherent in the ZX1, isn't it.

Yes - especially since I haven't attempted to add any special support
for 64-bit cards. The "IOMMU Bypass" mode on ZX1 is worth implementing
for 64-bit cards and I might in fact require it for graphics and
infiniband support.


> Thus, we'd only pick up a caching
> problems like you describe from the VIPT caches.

ok.

> > Any good ideas on how to prove IO is coherent?
> 
> Well, yes, but not without driver magic.  You program a device to take a
> piece of data in and rewrite it to a different buffer, then you compare
> buffers (making sure the first had a pattern in it and the second was
> completely clear).

The tg3 driver in fact has such a test... let me think about that some more.
I might hack the test to be more exhaustive.

> Well, it could be an I/O coherency problem, but if you have one of
> those, I'm surprised it boots at all.

For that bug, one could occasionally end up with a stale IO TLB entry.
And I'm wondering if that's possible when "swapping in" different
parts of an executable binary. To date, I'm under the impression
only executable pages are seeing this problem.  If it were IO,
I should be seeing it with data too, i.e. .c files causing errors.
Maybe I just need to exercise it more to make that happen.

thanks,
grant

* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-05 21:19       ` James Bottomley
  2004-06-05 22:21         ` Grant Grundler
@ 2004-06-11  5:58         ` Grant Grundler
  2004-06-11 14:02           ` James Bottomley
  2004-06-12  0:38           ` Jim Hull
  1 sibling, 2 replies; 14+ messages in thread
From: Grant Grundler @ 2004-06-11  5:58 UTC (permalink / raw)
  To: James Bottomley; +Cc: PARISC list

On Sat, Jun 05, 2004 at 04:19:24PM -0500, James Bottomley wrote:
> The pa8800 only has a 750k/750k VIPT cache, that's smaller than my
> raven.  The 32M L2 cache is PIPT, which doesn't suffer from aliasing or
> address remapping effects---in fact, the PA engineers probably arranged
> for a fdc not to flush it because there's no point; the only coherency
> problems the PIPT cache has is with I/O, which is supposed to be fully
> coherent in the ZX1, isn't it.  Thus, we'd only pick up a caching
> problems like you describe from the VIPT caches.


Well, I talked to another HP engineer who deals with this stuff
more than I do and came away with some observations:

1) values returned from the PDC_CACHE_INFO call (stride, block, loop)
   are supposed to be "universal" - ie apply to all levels of cache.
   The values are intended to work in the "architected cache flush loop".
   Does that only apply for the FDCE loop?
   What about FDC/FIC loops?


2) values from the C8000 prototype could be wrong. Output and diff
   are appended below. Basically it's telling me to traverse
   the entire 32MB cache with 128 byte stride.

3) I'm not sure if loop=1 indicates 2-way associative or direct mapped.
   I expect direct mapped.  I need to re-read the docs.
   Maybe someone knows? jsm?
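
To make the traversal in point 2 concrete, here is a small sketch of the loop shape, assuming the usual nesting (outer `count`, inner `loop`, address advanced by `stride` after each inner pass). The `flush_fn` callback is a hypothetical stand-in for a single FDCE or FICE instruction; the real loop is written in assembly. With the values in the console output that follows (stride 0x80, count 0x40000, loop 1) the walk covers 0x80 * 0x40000 = 0x2000000 bytes, which matches the reported dc_size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one FDCE/FICE instruction at `addr`. */
typedef void (*flush_fn)(uint64_t addr, void *ctx);

/* The four values PDC_CACHE_INFO reports per cache. */
struct cache_loop_params {
	uint64_t base;
	uint64_t stride;
	uint64_t count;
	uint64_t loop;
};

/* Walk the whole cache once.
 * Returns the size of the address range that was stepped through. */
uint64_t flush_whole_cache(const struct cache_loop_params *p,
			   flush_fn flush, void *ctx)
{
	uint64_t addr = p->base;
	uint64_t i, j;

	assert(p->stride != 0);

	for (i = 0; i < p->count; i++) {
		for (j = 0; j < p->loop; j++)
			flush(addr, ctx);	/* same address, `loop` times */
		addr += p->stride;
	}
	return addr - p->base;
}

/* Example callback: only counts how many flush operations were issued. */
void count_flush(uint64_t addr, void *ctx)
{
	(void)addr;
	(*(uint64_t *)ctx)++;
}
```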


Console output from debug info I enabled/added:
| model 9000/785/C8000
| ic_size 2000000 dc_size 2000000 it_size f0
| DC  base 0x0 stride 0x80 count 0x40000 loop 0x1
| dc_conf = 0x1882000  alias 0 blk 1 line 4 shift 1
|         wt 0 sh 0 cst 1 assoc 0
| IC  base 0x0 stride 0x80 count 0x40000 loop 0x1
| ic_conf = 0x1882000  alias 0 blk 1 line 4 shift 1
|         wt 0 sh 0 cst 1 assoc 0
| D-TLB conf: sh 3 page 1 cst 1 aid 0 pad1 0
| I-TLB conf: sh 3 page 1 cst 1 aid 3 pad1 0
| dcache_stride 128   icache_stride 128
| parisc_cache_init: Only equivalent aliasing supported!
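
As a cross-check of the dc_conf line above, the raw word can be decoded by hand. This sketch assumes the fields are packed most-significant-bit first in the low 32 bits, in the order given by struct pdc_cache_cf in the patch below (alias:4, block:4, line:3, shift:2, wt:1, sh:2, cst:3). The struct and function names are made up for illustration. Decoding 0x1882000 this way reproduces the printed values: alias 0, blk 1, line 4, shift 1, wt 0, sh 0, cst 1.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copy of the decoded fields; not the kernel's bit-field struct. */
struct cache_conf {
	unsigned int alias, block, line, shift, wt, sh, cst;
};

/* Extract `width` bits starting `first_bit` bits below the MSB of a
 * 32-bit word (bit 0 is the most significant bit). */
unsigned int field(uint32_t word, unsigned int first_bit, unsigned int width)
{
	assert(width >= 1 && first_bit + width <= 32);
	return (word >> (32 - first_bit - width)) & ((1u << width) - 1);
}

/* Decode the low 32 bits; the upper half is padding on 64-bit kernels. */
struct cache_conf decode_cache_conf(uint64_t raw)
{
	uint32_t w = (uint32_t)raw;
	struct cache_conf c;

	c.alias = field(w, 0, 4);
	c.block = field(w, 4, 4);
	c.line  = field(w, 8, 3);
	c.shift = field(w, 11, 2);
	c.wt    = field(w, 13, 1);
	c.sh    = field(w, 14, 2);
	c.cst   = field(w, 16, 3);
	return c;
}
```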


The difference between D-TLB and I-TLB was interesting but
I don't know what to make of it.

I noticed HPUX was calculating the stride differently than
parisc-linux. I didn't realize until later the difference
could be due to older/newer firmware versions and/or differences
in the "architected loop" initialization.

Changing "4 + cnf.cc_shift" (stride=128) to "3 + cnf.cc_shift"
(stride=64) didn't help. This implies the stride is not the problem.

grant


Index: arch/parisc/kernel/cache.c
===================================================================
RCS file: /var/cvs/linux-2.6/arch/parisc/kernel/cache.c,v
retrieving revision 1.17
diff -u -p -r1.17 cache.c
--- arch/parisc/kernel/cache.c	30 May 2004 18:57:23 -0000	1.17
+++ arch/parisc/kernel/cache.c	11 Jun 2004 05:33:38 -0000
@@ -123,47 +123,56 @@ parisc_cache_init(void)
 	if (pdc_cache_info(&cache_info) < 0)
 		panic("parisc_cache_init: pdc_cache_info failed");
 
-#if 0
-	printk(KERN_DEBUG "ic_size %lx dc_size %lx it_size %lx pdc_cache_info %d*long pdc_cache_cf %d\n",
-	    cache_info.ic_size,
-	    cache_info.dc_size,
-	    cache_info.it_size,
-	    sizeof (struct pdc_cache_info) / sizeof (long),
-	    sizeof (struct pdc_cache_cf)
-	);
-
-	printk(KERN_DEBUG "dc base %x dc stride %x dc count %x dc loop %d\n",
-	    cache_info.dc_base,
-	    cache_info.dc_stride,
-	    cache_info.dc_count,
-	    cache_info.dc_loop);
-
-	printk(KERN_DEBUG "dc conf: alias %d block %d line %d wt %d sh %d cst %d assoc %d\n",
-	    cache_info.dc_conf.cc_alias,
-	    cache_info.dc_conf.cc_block,
-	    cache_info.dc_conf.cc_line,
-	    cache_info.dc_conf.cc_wt,
-	    cache_info.dc_conf.cc_sh,
-	    cache_info.dc_conf.cc_cst,
-	    cache_info.dc_conf.cc_assoc);
-
-	printk(KERN_DEBUG "ic conf: alias %d block %d line %d wt %d sh %d cst %d assoc %d\n",
-	    cache_info.ic_conf.cc_alias,
-	    cache_info.ic_conf.cc_block,
-	    cache_info.ic_conf.cc_line,
-	    cache_info.ic_conf.cc_wt,
-	    cache_info.ic_conf.cc_sh,
-	    cache_info.ic_conf.cc_cst,
-	    cache_info.ic_conf.cc_assoc);
+#if 1
+	printk("ic_size %lx dc_size %lx it_size %lx\n",
+		cache_info.ic_size,
+		cache_info.dc_size,
+		cache_info.it_size);
+
+	printk("DC  base 0x%lx stride 0x%lx count 0x%lx loop 0x%lx\n",
+		cache_info.dc_base,
+		cache_info.dc_stride,
+		cache_info.dc_count,
+		cache_info.dc_loop);
+
+	printk("dc_conf = 0x%lx  alias %d blk %d line %d shift %d\n",
+		*(unsigned long *) (&cache_info.dc_conf),
+		cache_info.dc_conf.cc_alias,
+		cache_info.dc_conf.cc_block,
+		cache_info.dc_conf.cc_line,
+		cache_info.dc_conf.cc_shift);
+	printk("	wt %d sh %d cst %d assoc %d\n",
+		cache_info.dc_conf.cc_wt,
+		cache_info.dc_conf.cc_sh,
+		cache_info.dc_conf.cc_cst,
+		cache_info.dc_conf.cc_assoc);
+
+	printk("IC  base 0x%lx stride 0x%lx count 0x%lx loop 0x%lx\n",
+		cache_info.ic_base,
+		cache_info.ic_stride,
+		cache_info.ic_count,
+		cache_info.ic_loop);
+
+	printk("ic_conf = 0x%lx  alias %d blk %d line %d shift %d\n",
+		*(unsigned long *) (&cache_info.ic_conf),
+		cache_info.ic_conf.cc_alias,
+		cache_info.ic_conf.cc_block,
+		cache_info.ic_conf.cc_line,
+		cache_info.ic_conf.cc_shift);
+	printk("	wt %d sh %d cst %d assoc %d\n",
+		cache_info.ic_conf.cc_wt,
+		cache_info.ic_conf.cc_sh,
+		cache_info.ic_conf.cc_cst,
+		cache_info.ic_conf.cc_assoc);
 
-	printk(KERN_DEBUG "dt conf: sh %d page %d cst %d aid %d pad1 %d \n",
+	printk("D-TLB conf: sh %d page %d cst %d aid %d pad1 %d \n",
 	    cache_info.dt_conf.tc_sh,
 	    cache_info.dt_conf.tc_page,
 	    cache_info.dt_conf.tc_cst,
 	    cache_info.dt_conf.tc_aid,
 	    cache_info.dt_conf.tc_pad1);
 
-	printk(KERN_DEBUG "it conf: sh %d page %d cst %d aid %d pad1 %d \n",
+	printk("I-TLB conf: sh %d page %d cst %d aid %d pad1 %d \n",
 	    cache_info.it_conf.tc_sh,
 	    cache_info.it_conf.tc_page,
 	    cache_info.it_conf.tc_cst,
@@ -180,10 +189,17 @@ parisc_cache_init(void)
 		split_tlb = 1;
 	}
 
-	dcache_stride = (1 << (cache_info.dc_conf.cc_block + 3)) *
-						cache_info.dc_conf.cc_line;
-	icache_stride = (1 << (cache_info.ic_conf.cc_block + 3)) *
-						cache_info.ic_conf.cc_line;
+#if 0
+#define CAFL_STRIDE(cnf) ((1 << (cnf.cc_block + 3)) * cnf.cc_line)
+#else
+#define CAFL_STRIDE(cnf) (cnf.cc_block * (cnf.cc_line << (4 + cnf.cc_shift)))
+#endif
+	dcache_stride = CAFL_STRIDE(cache_info.dc_conf);
+	icache_stride = CAFL_STRIDE(cache_info.ic_conf);
+#undef CAFL_STRIDE
+
+printk("dcache_stride %d   icache_stride %d\n", dcache_stride, icache_stride);
+
 #ifndef CONFIG_PA20
 	if (pdc_btlb_info(&btlb_info) < 0) {
 		memset(&btlb_info, 0, sizeof btlb_info);
@@ -192,8 +208,8 @@ parisc_cache_init(void)
 
 	if ((boot_cpu_data.pdc.capabilities & PDC_MODEL_NVA_MASK) ==
 						PDC_MODEL_NVA_UNSUPPORTED) {
-		printk(KERN_WARNING "Only equivalent aliasing supported\n");
-#ifndef CONFIG_SMP
+		printk(KERN_WARNING "parisc_cache_init: Only equivalent aliasing supported!\n");
+#if 0
 		panic("SMP kernel required to avoid non-equivalent aliasing");
 #endif
 	}
Index: include/asm-parisc/pdc.h
===================================================================
RCS file: /var/cvs/linux-2.6/include/asm-parisc/pdc.h,v
retrieving revision 1.7
diff -u -p -r1.7 pdc.h
--- include/asm-parisc/pdc.h	4 Jun 2004 19:36:53 -0000	1.7
+++ include/asm-parisc/pdc.h	11 Jun 2004 05:34:04 -0000
@@ -346,10 +346,10 @@ struct pdc_cache_cf {		/* for PDC_CACHE 
 #ifdef __LP64__
 		cc_padW:32,
 #endif
-		cc_alias:4,	/* alias boundaries for virtual addresses   */
+		cc_alias: 4,	/* alias boundaries for virtual addresses   */
 		cc_block: 4,	/* to determine most efficient stride */
 		cc_line	: 3,	/* maximum amount written back as a result of store (multiple of 16 bytes) */
-		cc_pad0 : 2,	/* reserved */
+		cc_shift: 2,	/* how much to shift cc_block left */
 		cc_wt	: 1,	/* 0 = WT-Dcache, 1 = WB-Dcache */
 		cc_sh	: 2,	/* 0 = separate I/D-cache, else shared I/D-cache */
 		cc_cst  : 3,	/* 0 = incoherent D-cache, 1=coherent D-cache */

* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-11  5:58         ` Grant Grundler
@ 2004-06-11 14:02           ` James Bottomley
  2004-06-11 15:03             ` Grant Grundler
  2004-06-11 15:27             ` Grant Grundler
  2004-06-12  0:38           ` Jim Hull
  1 sibling, 2 replies; 14+ messages in thread
From: James Bottomley @ 2004-06-11 14:02 UTC (permalink / raw)
  To: Grant Grundler; +Cc: PARISC list

> Changing "4 + cnf.cc_shift" (stride=128) to "3 + cnf.cc_shift"
> (stride=64) didn't help. This implies the stride is not the problem.

The stride in the rest of the PA architecture is either 16 or 32; I'd be
surprised if the L1 stride in the 8800 were bigger (smaller stride is
actually more efficient since it gives finer control over caching).

Could you just hard code it to 32 to make certain that the
stride value isn't the source of the segv's?

Thanks,

James



* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-11 14:02           ` James Bottomley
@ 2004-06-11 15:03             ` Grant Grundler
  2004-06-11 15:27             ` Grant Grundler
  1 sibling, 0 replies; 14+ messages in thread
From: Grant Grundler @ 2004-06-11 15:03 UTC (permalink / raw)
  To: James Bottomley; +Cc: PARISC list

On Fri, Jun 11, 2004 at 10:02:29AM -0400, James Bottomley wrote:
> > Changing "4 + cnf.cc_shift" (stride=128) to "3 + cnf.cc_shift"
> > (stride=64) didn't help. This implies the stride is not the problem.
> 
> The stride in the rest of the PA architecture is either 16 or 32; I'd be
> surprised if the L1 stride in the 8800 were bigger (smaller stride is
> actually more efficient since it gives finer control over caching).

I always thought the stride was the smallest cacheline size.

> Could you just hard code it to 32 to make assureances certain that the
> stride value isn't the source of the segv's?

It didn't help. sshd is still not working.
I'll try 16 as well.

pa8800:~# dmesg | fgrep sshd
do_page_fault() pid=457 command='sshd' type=15 address=0x80fc43ac
do_page_fault() pid=461 command='sshd' type=6 address=0x00000003
sshd(463): unaligned access to 0x00000000faf0105b at ip=0x00000000406e536b
sshd(463): unaligned access to 0x00000000faf0107b at ip=0x00000000406e537f
do_page_fault() pid=463 command='sshd' type=15 address=0x00000002
do_page_fault() pid=465 command='sshd' type=15 address=0x0000000c
do_page_fault() pid=469 command='sshd' type=15 address=0x0000000c
sshd (pid 471): Illegal instruction (code 8) at 0000000000074b7b


But I have to wonder why do RC scripts work?
And why can I login on the console?
I thought all of those things would fork/exec other binaries as well.
Given everything is coming over NFS root, I can't blame IO either.

grant

* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-11 14:02           ` James Bottomley
  2004-06-11 15:03             ` Grant Grundler
@ 2004-06-11 15:27             ` Grant Grundler
  1 sibling, 0 replies; 14+ messages in thread
From: Grant Grundler @ 2004-06-11 15:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: PARISC list

On Fri, Jun 11, 2004 at 10:02:29AM -0400, James Bottomley wrote:
> Could you just hard code it to 32 to make assureances certain that the
> stride value isn't the source of the segv's?

stride=16 doesn't work either.
sshd only page faults (so far) and sometimes "hangs" (^C to kill it).
Out of 9 attempts, I get one hang and 4 page faults.
"Connection closed" on all except the "hang".

Using console, I can
o apt-get update
o apt-get upgrade
o ssh -l grundler 192.168.1.1

So far, everything from the console works.
I'm not sure why sshd is having such a hard time.
I still don't think "stride" is the problem.

grant

* RE: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-11  5:58         ` Grant Grundler
  2004-06-11 14:02           ` James Bottomley
@ 2004-06-12  0:38           ` Jim Hull
  2004-06-14 18:29             ` Grant Grundler
  1 sibling, 1 reply; 14+ messages in thread
From: Jim Hull @ 2004-06-12  0:38 UTC (permalink / raw)
  To: 'Grant Grundler', 'James Bottomley'; +Cc: 'PARISC list'

Grant:

> Well, I talked to another HP engineer who deals with this stuff
> more than I do and came away with some observations:
>
> 1) values returned from the PDC_CACHE_INFO call (stride, block, loop)
>    are supposed to be "universal" - ie apply to all levels of cache.
>    The values are intended to work in the "architected cache
>    flush loop".
>    Does that only apply for the FDCE loop?
>    What about FDC/FIC loops?

PDC_CACHE returns parameters for both the architected FDCE loop and the
architected FICE loop.  They are D_base, D_count, D_loop, and D_stride for the
FDCE loop and I_base, I_count, I_loop, and I_stride for the FICE loop.

The PDC_CACHE description also contains a Programming Note which explains how
you might use the values in the various bit-fields from two other returned
parameters (D_conf and I_conf) to flush a range of addresses using FDC or FIC.
Technically, these aren't "architected loops", they're just a suggestion.

> 2) values from the C8000 prototype could be wrong. Output and diff
>    are appended below. Basically it's telling me to traverse
>    the entire 32MB cache with 128 byte stride.

I checked the Mako ERS; it clearly describes what parameters PDC must
return in order for the flush loops to work.  The values you show match
those in the ERS.

> 3) I'm not sure if loop=1 indicates 2-way associative or
>    direct mapped.
>    I expect direct mapped.  I need to re-read the docs.
>    Maybe someone knows? jsm?

You should not think about the FDCE/FICE parameters as corresponding to any
particular property of the given cache.  They are simply abstract values to
plug into the architected flush loops in order to make them flush the whole
cache.  For example, since the Mako caches are all 4-way associative, you might
think that the "loop" parameter(s) would be 4.  However, because of the
design of the Mako FDCE and FICE instructions, executing with "loop" equal
to 4, and with "count" equal to 1/4 of its current value, would (sometimes)
fail to flush the whole cache.

> Console output from debug info I enabled/added:
> | model 9000/785/C8000
> | ic_size 2000000 dc_size 2000000 it_size f0
> | DC  base 0x0 stride 0x80 count 0x40000 loop 0x1
> | dc_conf = 0x1882000  alias 0 blk 1 line 4 shift 1
> |         wt 0 sh 0 cst 1 assoc 0
> | IC  base 0x0 stride 0x80 count 0x40000 loop 0x1
> | ic_conf = 0x1882000  alias 0 blk 1 line 4 shift 1
> |         wt 0 sh 0 cst 1 assoc 0
> | D-TLB conf: sh 3 page 1 cst 1 aid 0 pad1 0
> | I-TLB conf: sh 3 page 1 cst 1 aid 3 pad1 0
> | dcache_stride 128   icache_stride 128
> | parisc_cache_init: Only equivalent aliasing supported!
>
>
> The difference between D-TLB and I-TLB was interesting but
> I don't know what to make of it.
>=20
> I noticed HPUX was calculating the stride differently than
> parisc-linux. I didn't realize until later the difference
> could be due to older/newer firmware versions and/or differences
> in the "architected loop" initialization.

For FDC/FIC loops, the architecture was changed to handle machines with line
sizes larger than 64 bytes (like Mako, where it's 128).  The correct equation,
which will work on both new and old machines, is:

  stride = (1 << (block - 1)) * ((line * 16) << shift)
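
A small C rendering of that equation, for checking values by hand. The function names are hypothetical; `block`, `line` and `shift` are the D_conf/I_conf bit-fields. With the C8000 values quoted above (blk 1, line 4, shift 1) it gives 128. Assuming the formerly reserved shift bits read as zero on older machines, the expression reduces to (1 << (block + 3)) * line, which would explain why one equation can serve both old and new machines.

```c
#include <assert.h>

/* Stride for FDC/FIC range loops, per the equation above. */
unsigned int fdc_stride(unsigned int block, unsigned int line,
			unsigned int shift)
{
	assert(block >= 1);	/* block - 1 must not go negative */
	return (1u << (block - 1)) * ((line * 16u) << shift);
}

/* The older form, for comparison: (1 << (block + 3)) * line. */
unsigned int legacy_stride(unsigned int block, unsigned int line)
{
	return (1u << (block + 3)) * line;
}
```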

> Changing "4 + cnf.cc_shift" (stride=128) to "3 + cnf.cc_shift"
> (stride=64) didn't help. This implies the stride is not the problem.

If the "4 + cnf.cc_shift" is trying to compute a stride, then this may be the
source of the problem.  Try changing it as above.

 -- Jim


* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-12  0:38           ` Jim Hull
@ 2004-06-14 18:29             ` Grant Grundler
  2004-06-14 22:45               ` Jim Hull
  0 siblings, 1 reply; 14+ messages in thread
From: Grant Grundler @ 2004-06-14 18:29 UTC (permalink / raw)
  To: Jim Hull; +Cc: 'James Bottomley', 'PARISC list'

On Fri, Jun 11, 2004 at 05:38:24PM -0700, Jim Hull wrote:
> PDC_CACHE returns parameters for both the architected FDCE loop and the
> architected FICE loop.  They are D_base, D_count, D_loop, and D_stride for the
> FDCE loop and I_base, I_count, I_loop, and I_stride for the FICE loop.
> 
> The PDC_CACHE description also contains a Programming Note which explain how
> you might use the values in the various bit-fields from two other returned
> parameters (D_conf and I_conf) to flush a range of addresses using FDC or FIC.
> Technically, these aren't "architected loops", they're just a suggestion.

ah ok. In general, I prefer to do whatever I know HPUX is doing.
Because architected or not, there is a precedent that works (even
if for the wrong reasons).

> I checked the Mako ERS; it clearly describes what parameters PDC must
> return in order for the flush loops to work.  The values you show match
> those in the ERS.

Ok - thanks for confirming.

> > 3) I'm not sure if loop=1 indicates 2-way associative or 
> > direct mapped.
> >    I expect direct mapped.  I need to re-read the docs.
> >    Maybe someone knows? jsm?
> 
> You should not think about the FDCE/FICE parameters as corresponding to any
> particular property of the given cache.  They are simply abstract values to
> plug into the architected flush loops in order to make them flush the whole
> cache.

Understood. I'm not so worried about the "flush whole cache" as the
"flush range" functions where I thought we do need to know the
cache properties.
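
For context, a range flush steps through real addresses, so the step has to be no larger than the actual line size or lines get skipped. A minimal sketch, assuming the stride is a power of two; `line_fn` is a hypothetical callback standing in for a single FDC or FIC instruction, and the function names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one FDC/FIC instruction at `addr`. */
typedef void (*line_fn)(uintptr_t addr, void *ctx);

/* Flush every line that overlaps [start, end).
 * Returns the number of flush operations issued.
 * Ranges that wrap around the top of the address space are not handled. */
size_t flush_range(uintptr_t start, uintptr_t end, uintptr_t stride,
		   line_fn flush_line, void *ctx)
{
	uintptr_t addr;
	size_t issued = 0;

	/* stride must be a non-zero power of two for the mask below */
	assert(stride != 0 && (stride & (stride - 1)) == 0);

	/* round start down so a partially covered first line is included */
	for (addr = start & ~(stride - 1); addr < end; addr += stride) {
		flush_line(addr, ctx);
		issued++;
	}
	return issued;
}

/* Example callback that does nothing; useful for counting only. */
void ignore_line(uintptr_t addr, void *ctx)
{
	(void)addr;
	(void)ctx;
}
```

A stride that is too small only costs extra flush instructions, whereas one that is too large would leave lines unflushed, which is why trying smaller strides is a reasonable way to rule the stride out.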

> For example, since the Mako caches are all 4-way associative, you might
> think that the "loop" parameter(s) would be 4.  However, because of the
> design of the Mako FDCE and FICE instructions, executing with "loop" equal
> 4, and with "count" equal to 1/4 of its current value, would (sometimes)
> fail to flush the whole cache.

Ouch. Does that mean FDCE and FICE modify the L2 cache also?
Do FDC/FIC also affect the L2?

Someone was asserting it was not necessary to flush the L2 since the L2 is
a PIPT cache and would always be coherent. I don't really know but
that makes sense to me.

...
> > I noticed HPUX was calculating the stride differently than
> > parisc-linux. I didn't realize until later the difference
> > could be due to older/newer firmware versions and/or differences
> > in the "architected loop" initialization.
> 
> For FDC/FIC loops, the architecture was changed to handle machines with line
> sizes larger than 64 bytes (like Mako, where it's 128).  The correct equation,
> which will work on both new and old machines, is:
> 
>   stride = (1 << (block - 1)) * ((line * 16) << shift)

The above results in stride=128 on C8000 as well (output appended below).
I now have three variants to calculate stride:

#define CAFL_STRIDE(cnf) ((1 << (cnf.cc_block + 3)) * cnf.cc_line)
#define CAFL_STRIDE(cnf) (cnf.cc_block * (cnf.cc_line << (4 + cnf.cc_shift)))
#define CAFL_STRIDE(cnf) ((1 << (cnf.cc_block-1)) * (cnf.cc_line << (4 + cnf.cc_shift)))

1) original parisc-linux
2) "borrowed" from HPUX (i80_latest, maybe HPUX wants to use
   Jim Hull's suggestion as well?)
3) "New and Improved" :^)
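
To see where the three variants agree and differ, they can be put side by side. This is only an arithmetic sketch: the struct is a plain stand-in for the cc_block, cc_line and cc_shift bit-fields, and the function names are invented. For the C8000 values (blk 1, line 4, shift 1) the variants give 64, 128 and 128. Variants 2 and 3 agree when cc_block is 1 or 2 and diverge from 3 upward, since one multiplies by cc_block and the other by 1 << (cc_block - 1).

```c
#include <assert.h>

/* Plain stand-in for the cc_block, cc_line and cc_shift bit-fields. */
struct conf_fields {
	unsigned int block;
	unsigned int line;
	unsigned int shift;
};

/* 1) original parisc-linux */
unsigned int stride_v1(struct conf_fields c)
{
	return (1u << (c.block + 3)) * c.line;
}

/* 2) variant borrowed from HPUX */
unsigned int stride_v2(struct conf_fields c)
{
	return c.block * (c.line << (4 + c.shift));
}

/* 3) Jim Hull's equation */
unsigned int stride_v3(struct conf_fields c)
{
	assert(c.block >= 1);	/* block - 1 must not go negative */
	return (1u << (c.block - 1)) * (c.line << (4 + c.shift));
}
```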

> If the "4 + cnf.cc_shift" is trying to compute a stride, then this may be the
> source of the problem.  Try changing it as above.

I've tested stride of 32/64/128 and they all behave the same.
I'm skeptical stride is the cause of the problem.

So far, just once have I had problems booting with stride=128 and that
was with a suspect kernel (not built/linked clean).
In general, things work from the console (eg ssh foo, apt-get update,
apt-get upgrade, RC scripts, etc).

But sshd segfaults or gets illegal insn faults when trying to ssh
into the c8000. Seems like if it were an IO coherency problem, I'd see
more problems during boot time too...maybe there is something
related to non-zero spaceid.

Jim, do you know if Space Id hashing is disabled the same way
on PA8800 as for PA8500/8600/8700 CPUs?

Or maybe it's easier to verify in the Mako ERS that the following is correct:
(See arch/parisc/kernel/pacache.S)
srdis_pa20:

	/* Disable Space Register Hashing for PCXU,PCXU+,PCXW,PCXW+ */

	.word           0x144008bc  /* mfdiag %dr2,%r28 */
	depdi           0,54,1,%r28 /* clear DIAG_SPHASH_ENAB (bit 54) */
	.word           0x145c1840  /* mtdiag %r28,%dr2 */
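
For readers less used to PA-RISC bit numbering, here is a C rendering of what the depdi line does, assuming the architecture's convention that bit 0 is the most significant bit of the 64-bit register, so "bit 54" corresponds to 1 << (63 - 54) in C terms. The function names are invented, and the sketch says nothing about whether bit 54 is the right bit on PA8800, which is the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for PA-RISC style bit `n` of a 64-bit register (bit 0 = MSB). */
uint64_t pa_bit64(unsigned int n)
{
	assert(n <= 63);
	return UINT64_C(1) << (63 - n);
}

/* Equivalent of "depdi 0,54,1,%r28": clear the single bit at position 54. */
uint64_t clear_sphash_enab(uint64_t dr2)
{
	return dr2 & ~pa_bit64(54);
}
```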


thanks,
grant

...
Command line for kernel: 'root=/dev/nfs nfsroot=192.168.1.61:/home/tftpboot/pa8'
Selected kernel: HOME=/ from partition 0                                        
Warning: kernel name doesn't end with 32 or 64 -- Guessing... Choosing 64-bit ke
Entry 00100000 first 00100000 n 3                                               
Segment 0 load 00100000 size 5013888 mediaptr 0x1000                            
Segment 1 load 005cc000 size 362512 mediaptr 0x4ca000                           
Segment 2 load 00628000 size 393349 mediaptr 0x523000                           
Branching to kernel entry point 0x00100000.  If this is the last                
message you see, you may need to switch your console.  This is                  
a common symptom -- search the FAQ and mailing list at parisc-linux.org         
                                                                                
Linux version 2.6.7-rc2-pa4 (grundler@gsyprf11.external.hp.com) (gcc version 3.4
FP[0] enabled: Rev 1 Model 20                                                   
The 64-bit Kernel has started...                                                
Determining PDC firmware type: 64 bit PAT.                                      
model 000088a0 00000491 00000000 00000002 d4936494c9f85489 100000f0 00000008 002
vers  00000301                                                                  
CPUID vers 20 rev 4 (0x00000284)                                                
capabilities 0x35                                                               
model 9000/785/C8000                                                            
ic_size 2000000 dc_size 2000000 it_size f0                                      
DC  base 0x0 stride 0x80 count 0x40000 loop 0x1                                 
dc_conf = 0x1882000  alias 0 blk 1 line 4 shift 1                               
        wt 0 sh 0 cst 1 assoc 0                                                 
IC  base 0x0 stride 0x80 count 0x40000 loop 0x1                                 
ic_conf = 0x1882000  alias 0 blk 1 line 4 shift 1                               
        wt 0 sh 0 cst 1 assoc 0                                                 
D-TLB conf: sh 3 page 1 cst 1 aid 0 pad1 0                                      
I-TLB conf: sh 3 page 1 cst 1 aid 3 pad1 0                                      
dcache_stride 128   icache_stride 128                                           
parisc_cache_init: Only equivalent aliasing supported!                          
Total Memory: 1024 Mb                                                           
On node 0 totalpages: 262144                                                    
  DMA zone: 262144 pages, LIFO batch:16                                         
  Normal zone: 0 pages, LIFO batch:1                                            
  HighMem zone: 0 pages, LIFO batch:1                                           
Built 1 zonelists                                                               
Kernel command line: root=/dev/nfs nfsroot=192.168.1.61:/home/tftpboot/pa8800 i/
PID hash table entries: 16 (order 4: 256 bytes)  
...
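
As a sanity check on the numbers in the log, the flush-loop parameters can be
multiplied out. The sketch below assumes the whole-cache flush is an outer
loop of "count" steps that advances the address by "stride" each time, with an
inner loop that repeats the flush "loop" times per address; the struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values as printed on the "DC  base ... stride ... count ... loop" line. */
struct flush_params_sketch {
	uint64_t base;
	uint64_t stride;
	uint64_t count;
	uint64_t loop;
};

/* Bytes of address range swept by the outer loop (assumed structure):
 * one step of `stride` bytes per iteration, `count` iterations. */
static uint64_t flush_span(struct flush_params_sketch p)
{
	assert(p.stride != 0);	/* a zero stride would never advance */
	return p.stride * p.count;
}

/* Number of flush instructions issued if each address is hit `loop` times. */
static uint64_t flush_ops(struct flush_params_sketch p)
{
	return p.count * p.loop;
}
```

For the C8000 line (stride 0x80, count 0x40000, loop 0x1) the span comes out
to 0x2000000 bytes, which equals the dc_size 2000000 (32 MB) reported a few
lines earlier, so stride=128 is at least self-consistent with the other
PDC-reported values.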

* RE: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-14 18:29             ` Grant Grundler
@ 2004-06-14 22:45               ` Jim Hull
  2004-06-14 23:42                 ` Grant Grundler
  0 siblings, 1 reply; 14+ messages in thread
From: Jim Hull @ 2004-06-14 22:45 UTC (permalink / raw)
  To: 'Grant Grundler'; +Cc: 'James Bottomley', 'PARISC list'

Grant:

> Ouch. Does that mean FDCE and FICE modify the L2 cache also?
> Do FDC/FIC also affect the L2?

The PA processor architecture requires that all of these flush instructions
flush every level of the processor cache hierarchy all the way out to memory.
In other words, yes.

> Someone was asserting it was not necessary to flush the L2 since the L2 is
> a PIPT cache and would always be coherent. I don't really know but
> that makes sense to me.

You could imagine an architecture having several variants of the cache flushing
instructions that software could call depending on why it needed to flush; for
example, to achieve I-cache/D-cache coherence, to resolve virtual aliasing, to
communicate with non-coherent I/O devices, or to flush out to a battery-backed
RAM in case of powerfail.

I think pa-linux wants the second of these.  Unfortunately, PA only has a single
variant, and it was designed for the third and fourth cases.

> ...
> > > I noticed HPUX was calculating the stride differently than
> > > parisc-linux. I didn't realize until later the difference
> > > could be due to older/newer firmware versions and/or differences
> > > in the "architected loop" initialization.
> >
> > For FDC/FIC loops, the architecture was changed to handle machines with line
> > sizes larger than 64 bytes (like Mako, where it's 128).  The correct equation,
> > which will work on both new and old machines, is:
> >
> >   stride = (1 << (block - 1)) * ((line * 16) << shift)
>
> The above results in stride=128 on C8000 as well (output appended below).
> I now have three variants to calculate stride:
>
> #define CAFL_STRIDE(cnf) ((1 << (cnf.cc_block + 3)) * cnf.cc_line)
> #define CAFL_STRIDE(cnf) (cnf.cc_block * (cnf.cc_line << (4 + cnf.cc_shift)))
> #define CAFL_STRIDE(cnf) ((1 << (cnf.cc_block-1)) * (cnf.cc_line << (4 + cnf.cc_shift)))
>
> 1) original parisc-linux

This one doesn't include the new "shift" field, so it's clearly out-of-date.

> 2) "borrowed" from HPUX (i80_latest, maybe HPUX wants to use
>    Jim Hull's suggestion as well?)

This one is almost correct.  If "block" is 1 or 2, it gets the right answer -
it's only wrong for "block" values of 3 or more.  I suspect that HPUX appears to
work only because no PDC has ever returned anything other than 1 for "block".

> 3) "New and Improved" :^)

And correct!  If you'd like a more optimized version, this is equivalent:

#define CAFL_STRIDE(cnf) (cnf.cc_line << (3 + cnf.cc_block + cnf.cc_shift))
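
The equivalence follows from collecting the powers of two:
2^(block-1) * line * 2^(4+shift) = line * 2^(3+block+shift). A small C sketch
that checks this by brute force is below. The function names are hypothetical
and the tested ranges are arbitrary, not the architected field widths.

```c
#include <assert.h>

/* "Simple math" form: variant 3 from the list above. */
static unsigned long stride_simple(unsigned int block, unsigned int line,
				   unsigned int shift)
{
	assert(block >= 1);	/* block - 1 must not go negative */
	return (1UL << (block - 1)) * ((unsigned long)line << (4 + shift));
}

/* Single-shift form: the optimized macro. */
static unsigned long stride_single_shift(unsigned int block, unsigned int line,
					 unsigned int shift)
{
	return (unsigned long)line << (3 + block + shift);
}

/* Returns 1 if both forms agree for every combination in a small,
 * arbitrarily chosen range; returns 0 on the first mismatch. */
static int stride_forms_agree(void)
{
	unsigned int block, line, shift;

	for (block = 1; block <= 8; block++)
		for (line = 0; line <= 15; line++)
			for (shift = 0; shift <= 7; shift++)
				if (stride_simple(block, line, shift) !=
				    stride_single_shift(block, line, shift))
					return 0;
	return 1;
}
```

Note that the single-shift form is only equivalent for block >= 1, since the
original expression shifts by block - 1.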

> Jim, do you know if Space Id hashing is disabled the same way
> on PA8800 as for PA8500/8600/8700 CPUs?
>
> Or maybe it's easier to verify in the Mako ERS that the following is correct:
> (See arch/parisc/kernel/pacache.S)
> srdis_pa20:
>
> 	/* Disable Space Register Hashing for PCXU,PCXU+,PCXW,PCXW+ */
>
> 	.word           0x144008bc  /* mfdiag %dr2,%r28 */
> 	depdi           0,54,1,%r28 /* clear DIAG_SPHASH_ENAB (bit 54) */
> 	.word           0x145c1840  /* mtdiag %r28,%dr2 */

Yes, the Mako ERS says that DIAG_SPHASH_ENAB is still bit 54 in diag register 2.

 -- Jim


* Re: [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2
  2004-06-14 22:45               ` Jim Hull
@ 2004-06-14 23:42                 ` Grant Grundler
  0 siblings, 0 replies; 14+ messages in thread
From: Grant Grundler @ 2004-06-14 23:42 UTC (permalink / raw)
  To: Jim Hull; +Cc: 'PARISC list'

On Mon, Jun 14, 2004 at 03:45:32PM -0700, Jim Hull wrote:
...
> You could imagine an architecture having several variants of the cache flushing
> instructions that software could call depending on why it needed to flush; for
> example, to achieve I-cache/D-cache coherence, to resolve virtual aliasing, to
> communicate with non-coherent I/O devices, or to flush out to a battery-backed
> RAM in case of powerfail.
> 
> I think pa-linux wants the second of these.

Yes - I left out that part in my paraphrasing of "someone"'s comments.

>   Unfortunately, PA only has a single
> variant, and it was designed for the third and fourth cases.

ok.

> > 2) "borrowed" from HPUX (i80_latest, maybe HPUX wants to use
> >    Jim Hull's suggestion as well?)
> 
> This one is almost correct.  If "block" is 1 or 2, it gets the right
> answer - it's only wrong for "block" values of 3 or more.
> I suspect that HPUX appears to work only because no PDC has ever
> returned anything other than 1 for "block".
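
Jim's point about variant 2 can be checked by isolating the factor each
formula derives from "block": the HPUX-style macro multiplies by block itself,
while the architected equation multiplies by 2^(block-1). A minimal sketch,
with hypothetical function names:

```c
#include <assert.h>

/* Factor used by the HPUX-style macro: block itself. */
static unsigned long block_factor_hpux(unsigned int block)
{
	return block;
}

/* Factor used by the architected equation: 2^(block - 1). */
static unsigned long block_factor_arch(unsigned int block)
{
	assert(block >= 1);	/* block - 1 must not go negative */
	return 1UL << (block - 1);
}
```

The two factors agree for block = 1 (1 vs 1) and block = 2 (2 vs 2), then
diverge: block = 3 gives 3 vs 4, and block = 4 gives 4 vs 8. Since the rest of
both formulas is identical, the strides differ by exactly that ratio.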

Ok. And PDC probably won't return anything else if it doesn't have to,
even if it would be more efficient, because of "pre-enablement" of
existing OSes. I have no idea if PDC will ever need to.

> > 3) "New and Improved" :^)
> 
> And correct!  If you'd like a more optimized version, this is equivalent:
> 
> #define CAFL_STRIDE(cnf) (cnf.cc_line << (3 + cnf.cc_block + cnf.cc_shift))

yes - that's nicer. I'll keep the original "simple math" version
to document the origin.

> Yes, the Mako ERS says that DIAG_SPHASH_ENAB is still bit 54 in
> diag register 2.

Ok - thank you.

For the moment, I'm out of ideas about what might be wrong.
Maybe I'll troll the HPUX source tree for any PA8800 
or ZX1 specific code changes...

*sigh*

thanks,
grant

end of thread, other threads:[~2004-06-14 23:42 UTC | newest]

Thread overview: 14+ messages
2004-06-04 20:25 [parisc-linux] PA8800/ZX1 support committed to 2.6.7-rc2-pa2 Grant Grundler
2004-06-05  6:51 ` Grant Grundler
2004-06-05 14:10   ` James Bottomley
2004-06-05 21:05     ` Grant Grundler
2004-06-05 21:19       ` James Bottomley
2004-06-05 22:21         ` Grant Grundler
2004-06-11  5:58         ` Grant Grundler
2004-06-11 14:02           ` James Bottomley
2004-06-11 15:03             ` Grant Grundler
2004-06-11 15:27             ` Grant Grundler
2004-06-12  0:38           ` Jim Hull
2004-06-14 18:29             ` Grant Grundler
2004-06-14 22:45               ` Jim Hull
2004-06-14 23:42                 ` Grant Grundler
