* RM7k cache_flush_sigtramp
@ 2003-07-31 1:56 Fuxin Zhang
2003-07-31 11:46 ` Ralf Baechle
0 siblings, 1 reply; 22+ messages in thread
From: Fuxin Zhang @ 2003-07-31 1:56 UTC (permalink / raw)
To: MAKE FUN PRANK CALLS
Hi,
r4k_flush_cache_sigtramp does not seem to be enough for RM7000 CPUs: there
is a write buffer between the L1 dcache and the L2 cache, so a written-back
block may not yet be visible to the icache. This small patch fixes crashes
of my X server on the ev64240.
--- r4kcache.h.ori 2003-07-31 09:51:01.000000000 +0800
+++ r4kcache.h 2003-07-31 09:51:57.000000000 +0800
@@ -94,6 +94,9 @@
".set noreorder\n\t"
".set mips3\n"
"1:\tcache %0,(%1)\n"
+#ifdef CONFIG_CPU_RM7000
+ "sync\n\t"
+#endif
"2:\t.set mips0\n\t"
".set reorder\n\t"
".section\t__ex_table,\"a\"\n\t"
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-07-31 1:56 Fuxin Zhang
@ 2003-07-31 11:46 ` Ralf Baechle
2003-07-31 12:57 ` Fuxin Zhang
0 siblings, 1 reply; 22+ messages in thread
From: Ralf Baechle @ 2003-07-31 11:46 UTC (permalink / raw)
To: Fuxin Zhang; +Cc: MAKE FUN PRANK CALLS
On Thu, Jul 31, 2003 at 09:56:08AM +0800, Fuxin Zhang wrote:
> Date: Thu, 31 Jul 2003 09:56:08 +0800
> From: Fuxin Zhang <fxzhang@ict.ac.cn>
> To: MAKE FUN PRANK CALLS <linux-mips@linux-mips.org>
^^^^^^^^^^^^^^^^^^^^
Funny name for the list :-)
> r4k_flush_cache_sigtramp does not seem to be enough for RM7000 CPUs: there
> is a write buffer between the L1 dcache and the L2 cache, so a written-back
> block may not yet be visible to the icache. This small patch fixes crashes
> of my X server on the ev64240.
It would seem a similar fix is also needed in other places then?
Ralf
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-07-31 11:46 ` Ralf Baechle
@ 2003-07-31 12:57 ` Fuxin Zhang
0 siblings, 0 replies; 22+ messages in thread
From: Fuxin Zhang @ 2003-07-31 12:57 UTC (permalink / raw)
To: Ralf Baechle; +Cc: MAKE FUN PRANK CALLS
Ralf Baechle wrote:
>On Thu, Jul 31, 2003 at 09:56:08AM +0800, Fuxin Zhang wrote:
>
>>Date: Thu, 31 Jul 2003 09:56:08 +0800
>>From: Fuxin Zhang <fxzhang@ict.ac.cn>
>>To: MAKE FUN PRANK CALLS <linux-mips@linux-mips.org>
>    ^^^^^^^^^^^^^^^^^^^^
>
>Funny name for the list :-)
>
>>r4k_flush_cache_sigtramp does not seem to be enough for RM7000 CPUs: there
>>is a write buffer between the L1 dcache and the L2 cache, so a written-back
>>block may not yet be visible to the icache. This small patch fixes crashes
>>of my X server on the ev64240.
>
>It would seem a similar fix is also needed in other places then?
>
I have not thought about it further. But:
1. I have implemented wb_flush for this board, using a sync and an
uncached read, just in case the many buffers in the CPU and the system
bridge surprise me.
2. There are still occasional oopses, especially during I/O activity,
e.g. when fscking a disk.
What would you suggest looking at? Some flushes go through all levels of
cache; I think those should be safe. I will check later.
Thanks.
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: RM7k cache_flush_sigtramp
@ 2003-07-31 16:50 Adam Kiepul
2003-08-01 0:40 ` Fuxin Zhang
2003-08-01 7:51 ` Dominic Sweetman
0 siblings, 2 replies; 22+ messages in thread
From: Adam Kiepul @ 2003-07-31 16:50 UTC (permalink / raw)
To: 'Ralf Baechle', Fuxin Zhang; +Cc: MAKE FUN PRANK CALLS
Hi,
If this is just to ensure the I Cache coherency for modified code then the following should be sufficient:
cache Hit_Writeback_D, offset(base_register)
cache Hit_Invalidate_I, offset(base_register)
The ordering does matter however since the Hit_Invalidate_I makes sure the write buffer is flushed.
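As a rough illustration of the two operations named above: the op field of the MIPS `cache` instruction is five bits wide, with the low two bits selecting the cache and the upper three selecting the operation. The sketch below composes op values from those two parts. The enum and function names are made up for this example; the resulting values 0x19 (Hit_Writeback_D) and 0x10 (Hit_Invalidate_I) are, as far as I know, the ones the Linux MIPS headers use.

```c
#include <stdint.h>

/* Cache selector: bits 1..0 of the 5-bit op field of the `cache` instruction. */
enum cache_sel { CACHE_I = 0, CACHE_D = 1, CACHE_T = 2, CACHE_S = 3 };

/* Operation: bits 4..2 of the op field (R4000-style "hit" operations only). */
enum cache_hit_op { HIT_INVALIDATE = 4, HIT_WRITEBACK_INV = 5, HIT_WRITEBACK = 6 };

/* Hypothetical helper: combine operation and cache selector into one op value. */
static inline uint8_t cache_op(enum cache_sel sel, enum cache_hit_op op)
{
    return (uint8_t)(((unsigned)op << 2) | (unsigned)sel);
}
```

With these definitions, `cache_op(CACHE_D, HIT_WRITEBACK)` gives the writeback step and `cache_op(CACHE_I, HIT_INVALIDATE)` the invalidate step of the sequence above.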
Kind Regards,
_______________________________
Adam Kiepul
Sr. Applications Engineer
PMC-Sierra, Microprocessor Division
Mission Towers One
3975 Freedom Circle
Santa Clara, CA 95054, USA
Direct: 408 239 8124
Fax: 408 492 9462
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-07-31 16:50 Adam Kiepul
@ 2003-08-01 0:40 ` Fuxin Zhang
2003-08-01 3:01 ` Ralf Baechle
2003-08-01 7:51 ` Dominic Sweetman
1 sibling, 1 reply; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-01 0:40 UTC (permalink / raw)
To: Adam Kiepul, MAKE FUN PRANK CALLS
Adam Kiepul wrote:
>Hi,
>
>If this is just to ensure the I Cache coherency for modified code then the following should be sufficient:
>
>cache Hit_Writeback_D, offset(base_register)
>cache Hit_Invalidate_I, offset(base_register)
>
>
Current Linux code does exactly this. But without a sync I was seeing all
kinds of faults occurring around the sigreturn point on the stack, and
adding a sync greatly improves the stability.
>The ordering does matter however since the Hit_Invalidate_I makes sure the write buffer is flushed.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 0:40 ` Fuxin Zhang
@ 2003-08-01 3:01 ` Ralf Baechle
2003-08-01 4:59 ` Fuxin Zhang
0 siblings, 1 reply; 22+ messages in thread
From: Ralf Baechle @ 2003-08-01 3:01 UTC (permalink / raw)
To: Fuxin Zhang; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
Adam,
On Fri, Aug 01, 2003 at 08:40:14AM +0800, Fuxin Zhang wrote:
> Current Linux code does exactly this. But without a sync I was seeing all
> kinds of faults occurring around the sigreturn point on the stack, and
> adding a sync greatly improves the stability.
>
> >The ordering does matter however since the Hit_Invalidate_I makes sure the
> >write buffer is flushed.
Could there be an erratum explaining Fuxin's findings?
Fuxin, what version are you running?
Ralf
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 3:01 ` Ralf Baechle
@ 2003-08-01 4:59 ` Fuxin Zhang
0 siblings, 0 replies; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-01 4:59 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
I am using a slightly modified 2.4.21-pre4, based on CVS from early this
month(?).
We have merged with the latest CVS; I will run it and report the result tonight.
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: RM7k cache_flush_sigtramp
2003-07-31 16:50 Adam Kiepul
2003-08-01 0:40 ` Fuxin Zhang
@ 2003-08-01 7:51 ` Dominic Sweetman
2003-08-01 7:51 ` Dominic Sweetman
2003-08-01 9:26 ` Ralf Baechle
1 sibling, 2 replies; 22+ messages in thread
From: Dominic Sweetman @ 2003-08-01 7:51 UTC (permalink / raw)
To: Adam Kiepul; +Cc: 'Ralf Baechle', Fuxin Zhang, linux-mips
> If this is just to ensure the I Cache coherency for modified code
> then the following should be sufficient:
>
> cache Hit_Writeback_D, offset(base_register)
> cache Hit_Invalidate_I, offset(base_register)
>
> The ordering does matter however since the Hit_Invalidate_I makes
> sure the write buffer is flushed.
I'm probably jumping into the middle of something, sorry...
The MIPS32/MIPS64 release 2 architecture includes a useful instruction
SYNCI which does the whole job (repeat on each affected cache line)
and is legal in user mode; this will take a while to spread but I'd
recommend it as a model worth following.
So I hope that kernels will provide one function for "I've just
written instructions and now I want to execute them", and not export
the separate writeback-D/invalidate-I interface.
--
Dominic Sweetman
MIPS Technologies
dom@mips.com
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 7:51 ` Dominic Sweetman
2003-08-01 7:51 ` Dominic Sweetman
@ 2003-08-01 9:26 ` Ralf Baechle
2003-08-01 14:18 ` Fuxin Zhang
2003-08-04 8:45 ` Dominic Sweetman
1 sibling, 2 replies; 22+ messages in thread
From: Ralf Baechle @ 2003-08-01 9:26 UTC (permalink / raw)
To: Dominic Sweetman; +Cc: Adam Kiepul, Fuxin Zhang, linux-mips
On Fri, Aug 01, 2003 at 08:51:39AM +0100, Dominic Sweetman wrote:
> The MIPS32/MIPS64 release 2 architecture includes a useful instruction
> SYNCI which does the whole job (repeat on each affected cache line)
> and is legal in user mode; this will take a while to spread but I'd
> recommend it as a model worth following.
> So I hope that kernels will provide one function for "I've just
> written instructions and now I want to execute them", and not export
> the separate writeback-D/invalidate-I interface.
Linux supports the traditional MIPS UNIX cacheflush(2) syscall through
a libc interface. Since I've not seen any other use for the call than
I/D-cache synchronization, I'd just make cacheflush(3) use SYNCI where
available (or maybe one of the other vendor-specific mechanisms ...) and
fall back to cacheflush(2) where available. GCC would be another place
to teach about SYNCI, for its trampolines.
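A minimal sketch of what such a single "I just wrote code, now I want to run it" entry point could look like from user space. It assumes glibc's `<sys/cachectl.h>` on MIPS Linux, which declares `cacheflush()` and the `BCACHE` flag, and falls back to the compiler builtin `__builtin___clear_cache` (GCC and Clang) elsewhere. The wrapper name `flush_for_exec` is made up for this example; it does not use SYNCI directly.

```c
#include <stddef.h>
#if defined(__mips__) && defined(__linux__)
#include <sys/cachectl.h>   /* cacheflush(), ICACHE, DCACHE, BCACHE */
#endif

/* Hypothetical wrapper: make instructions written to [addr, addr + nbytes)
 * visible to instruction fetch. Returns 0 on success, -1 on failure. */
static int flush_for_exec(void *addr, size_t nbytes)
{
#if defined(__mips__) && defined(__linux__)
    /* Traditional MIPS interface: write back the D-cache and invalidate
     * the I-cache for the given range. */
    return cacheflush(addr, (int)nbytes, BCACHE);
#else
    /* Portable fallback: the compiler emits whatever the target needs
     * (possibly nothing on architectures with coherent I/D caches). */
    char *begin = (char *)addr;
    __builtin___clear_cache(begin, begin + nbytes);
    return 0;
#endif
}
```

Callers then never see the separate writeback-D and invalidate-I steps, which leaves the implementation free to switch to a different mechanism later.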
Ralf
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 9:26 ` Ralf Baechle
@ 2003-08-01 14:18 ` Fuxin Zhang
2003-08-02 17:02 ` Ralf Baechle
2003-08-04 8:45 ` Dominic Sweetman
1 sibling, 1 reply; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-01 14:18 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Dominic Sweetman, Adam Kiepul, linux-mips
I just ran a fresh 2.4.21 kernel on my board; no luck, the problem
remains.
But I notice that my hardware may have some problems, especially with the
add-on IDE card. The headache continues...
As to the discussion of SYNC, I can't help wondering whether cache
management should be totally hidden from programmers. People tend to write
the "safest" code because of all kinds of brain-damaged hardware
differences, which leads to inefficient code. That cancels out the
potential speed benefit of simpler hardware. Also, today's hardware does
not seem as expensive as it was before...
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: RM7k cache_flush_sigtramp
@ 2003-08-01 15:42 Adam Kiepul
2003-08-04 3:38 ` Fuxin Zhang
2003-08-06 11:00 ` Fuxin Zhang
0 siblings, 2 replies; 22+ messages in thread
From: Adam Kiepul @ 2003-08-01 15:42 UTC (permalink / raw)
To: 'Fuxin Zhang', Ralf Baechle; +Cc: MAKE FUN PRANK CALLS
Hi Fuxin,
Could you please provide me with the _original_ Kernel code disassembly snippet around the point where your SYNC patch applies?
Also, can you check what RM7000 part revision is on your board? You can find it out by reading the PrID register.
I will check if there is an erratum that the code could trigger.
By the way, are you aware of any other ev64240 board that would exhibit the same behavior?
I would be quite careful about drawing any conclusions at the moment, since we cannot rule out the possibility that it is simply a "bad CPU on the board" case. Please note that the SYNC instruction changes a lot about how things physically happen in the CPU, so it can often mask various problems, such as a bad part.
Thank you,
Adam
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 14:18 ` Fuxin Zhang
@ 2003-08-02 17:02 ` Ralf Baechle
0 siblings, 0 replies; 22+ messages in thread
From: Ralf Baechle @ 2003-08-02 17:02 UTC (permalink / raw)
To: Fuxin Zhang; +Cc: Dominic Sweetman, Adam Kiepul, linux-mips
On Fri, Aug 01, 2003 at 10:18:50PM +0800, Fuxin Zhang wrote:
> I just ran a fresh 2.4.21 kernel on my board; no luck, the problem
> remains. But I notice that my hardware may have some problems,
> especially with the add-on IDE card. The headache continues...
>
> As to the discussion of SYNC, I can't help wondering whether cache
> management should be totally hidden from programmers. People tend to
> write the "safest" code because of all kinds of brain-damaged hardware
> differences, which leads to inefficient code. That cancels out the
> potential speed benefit of simpler hardware. Also, today's hardware does
> not seem as expensive as it was before...
Cache management needs to be hidden from programmers as well as
possible - the average programmer has no clue about how caches work.
We've come up with an API that hides the actual functioning of caches
pretty well for DMA devices, see Documentation/DMA-mapping.txt and, in
2.6, a more generalized version documented in Documentation/DMA-API.txt.
Ralf
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 15:42 RM7k cache_flush_sigtramp Adam Kiepul
@ 2003-08-04 3:38 ` Fuxin Zhang
2003-08-06 11:00 ` Fuxin Zhang
1 sibling, 0 replies; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-04 3:38 UTC (permalink / raw)
To: Adam Kiepul; +Cc: Ralf Baechle, MAKE FUN PRANK CALLS
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=gb18030; format=flowed, Size: 3201 bytes --]
Hi Adam,
My CPU's PrID is 0x2732; it runs at 133x2 MHz.
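For readers who want to split that PrID value into its fields, here is a small sketch. It assumes the classic layout with the implementation number in bits 15..8 and the revision in bits 7..0; splitting the revision into major and minor nibbles follows the R4000 convention and is an assumption as far as the RM7000 is concerned. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the decoded PrID fields. */
struct prid_fields {
    uint8_t imp;        /* implementation number, bits 15..8 */
    uint8_t rev_major;  /* revision, bits 7..4 (R4000-style convention) */
    uint8_t rev_minor;  /* revision, bits 3..0 */
};

static struct prid_fields decode_prid(uint32_t prid)
{
    struct prid_fields f;
    f.imp       = (uint8_t)((prid >> 8) & 0xff);
    f.rev_major = (uint8_t)((prid >> 4) & 0x0f);
    f.rev_minor = (uint8_t)(prid & 0x0f);
    return f;
}
```

For 0x2732 this yields implementation 0x27, which is the value Linux associates with the RM7000, and revision 3.2 under the nibble convention assumed here.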
Disassembly before the patch:
ffffffff8010f2fc <r4k_flush_cache_sigtramp>:
ffffffff8010f2fc: 3c03802c lui v1,0x802c
ffffffff8010f300: 246367ec addiu v1,v1,26604
ffffffff8010f304: 94620010 lhu v0,16(v1)
ffffffff8010f308: 94650000 lhu a1,0(v1)
ffffffff8010f30c: 00021023 negu v0,v0
ffffffff8010f310: 00821024 and v0,a0,v0
ffffffff8010f314: bc550000 cache 0x15,0(v0)
ffffffff8010f318: 00052823 negu a1,a1
ffffffff8010f31c: 00852024 and a0,a0,a1
ffffffff8010f320: bc900000 cache 0x10,0(a0)
ffffffff8010f324: 03e00008 jr ra
ffffffff8010f328: 00000000 nop
Disassembly after the patch:
ffffffff8010ceb0 <r4k_flush_cache_sigtramp>:
ffffffff8010ceb0: 3c03802f lui v1,0x802f
ffffffff8010ceb4: 2463e3ac addiu v1,v1,-7252
ffffffff8010ceb8: 94620010 lhu v0,16(v1)
ffffffff8010cebc: 94650000 lhu a1,0(v1)
ffffffff8010cec0: 00021023 negu v0,v0
ffffffff8010cec4: 00821024 and v0,a0,v0
ffffffff8010cec8: bc550000 cache 0x15,0(v0)
ffffffff8010cecc: 0000000f sync
ffffffff8010ced0: 00052823 negu a1,a1
ffffffff8010ced4: 00852024 and a0,a0,a1
ffffffff8010ced8: bc900000 cache 0x10,0(a0)
ffffffff8010cedc: 03e00008 jr ra
ffffffff8010cee0: 00000000 nop
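The two `cache` words in these listings can be taken apart with a few shifts. The sketch below decodes the major opcode, base register, op field and offset of a `cache` instruction, and also mirrors the `negu`/`and` pairs, which round the address down to a line boundary (the two `lhu` loads appear to fetch the D- and I-cache line sizes, but that is my reading of the listing). All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical decoded form of a MIPS `cache` instruction word. */
struct cache_insn {
    unsigned opcode;  /* bits 31..26, 0x2f for CACHE */
    unsigned base;    /* bits 25..21, base register number */
    unsigned op;      /* bits 20..16, cache selector plus operation */
    int16_t  offset;  /* bits 15..0, signed displacement */
};

static struct cache_insn decode_cache_insn(uint32_t word)
{
    struct cache_insn d;
    d.opcode = (word >> 26) & 0x3f;
    d.base   = (word >> 21) & 0x1f;
    d.op     = (word >> 16) & 0x1f;
    d.offset = (int16_t)(uint16_t)(word & 0xffff);
    return d;
}

/* Equivalent of `negu tmp, linesize; and addr, addr, tmp`:
 * round addr down to the start of its cache line (linesize is a power of 2). */
static uint32_t line_base(uint32_t addr, uint32_t linesize)
{
    return addr & (0u - linesize);
}
```

Decoding 0xbc550000 gives op 0x15 with base register 2 (v0), and 0xbc900000 gives op 0x10 with base register 4 (a0). In the Linux headers those op values are named Hit_Writeback_Inv_D and Hit_Invalidate_I respectively.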
We do have more than one set of ev64240 boards and RM7k CPUs, but it will
take some time for me to get another one for testing. I will tell you the
result once I have done it.
Thank you.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 9:26 ` Ralf Baechle
2003-08-01 14:18 ` Fuxin Zhang
@ 2003-08-04 8:45 ` Dominic Sweetman
2003-08-04 11:51 ` Maciej W. Rozycki
1 sibling, 1 reply; 22+ messages in thread
From: Dominic Sweetman @ 2003-08-04 8:45 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Dominic Sweetman, Adam Kiepul, Fuxin Zhang, linux-mips
Ralf,
> Linux supports the traditional MIPS UNIX cacheflush(2) syscall through
> a libc interface. Since I've not seen any other use for the call than
> I/D-cache synchronization, I'd just make cacheflush(3) use SYNCI where
> available...
SYNCI just does what's required to execute code you just wrote: that's
a D-cache writeback and an I-cache invalidate. It doesn't invalidate
the D-cache afterwards, which is required by the definition of
cacheflush(3).
I think it would be better to provide cache manipulation calls defined
top-down (by their purpose); but so long as we are stuck with calls
which are defined as performing particular low-level actions, it's
surely dangerous to guess that we know what they are used for so we
can trim the functions accordingly...
--
Dominic Sweetman
MIPS Technologies
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-04 8:45 ` Dominic Sweetman
@ 2003-08-04 11:51 ` Maciej W. Rozycki
0 siblings, 0 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2003-08-04 11:51 UTC (permalink / raw)
To: Dominic Sweetman; +Cc: Ralf Baechle, Adam Kiepul, Fuxin Zhang, linux-mips
Dominic,
> I think it would be better to provide cache manipulation calls defined
> top-down (by their purpose); but so long as we are stuck with calls
> which are defined as performing particular low-level actions, it's
> surely dangerous to guess that we know what they are used for so we
> can trim the functions accordingly...
The API is not cast in stone -- if there's a justifiable benefit,
appropriate functions can be added; either completely new ones (possibly
inlined) or as an extension to cacheflush() (which still has 30 bits
freely available).
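To make the "30 bits" remark concrete: the third argument of cacheflush() is a bit mask in which, as far as I know, only two bits are defined (ICACHE and DCACHE, with BCACHE as their union). The sketch below counts the unused bits of a 32-bit mask. The extra flag is purely hypothetical and is not part of any real ABI.

```c
#include <stdint.h>

/* Flag values as defined by the Linux MIPS cachectl header. */
#define ICACHE (1u << 0)            /* flush instruction cache */
#define DCACHE (1u << 1)            /* write back / invalidate data cache */
#define BCACHE (ICACHE | DCACHE)    /* both caches */

/* HYPOTHETICAL extension flag, shown only to illustrate the free space. */
#define CF_SYNC_EXEC_HYPOTHETICAL (1u << 2)

/* Count the bits of a 32-bit flag word not claimed by the existing flags. */
static unsigned free_flag_bits(void)
{
    uint32_t used = BCACHE;
    unsigned n = 0;
    for (uint32_t bit = 1; bit != 0; bit <<= 1)
        if (!(used & bit))
            n++;
    return n;
}
```

A purpose-defined operation of the kind Dominic asks for could in principle take one of those free bits, at the cost of old kernels not recognising it.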
Maciej
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: RM7k cache_flush_sigtramp
2003-08-01 15:42 RM7k cache_flush_sigtramp Adam Kiepul
2003-08-04 3:38 ` Fuxin Zhang
@ 2003-08-06 11:00 ` Fuxin Zhang
2003-08-06 11:55 ` Ralf Baechle
1 sibling, 1 reply; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-06 11:00 UTC (permalink / raw)
To: Adam Kiepul; +Cc: Ralf Baechle, MAKE FUN PRANK CALLS
[-- Attachment #1: Type: text/plain, Size: 4266 bytes --]
Hi,
Over the last few days I have performed more experiments on our ev64240
board. It now seems I have at least two problems: the sigtramp flush and
the L3 cache.
Let me describe the symptoms first.
1. fsck on /dev/hda4 (a 10G partition of a 40G IDE disk on a PCI add-on
card using the Intel PIIX4 chip) frequently fails with oopses in various
places: __remove_inode_queue, free_buffers, vmscan:359 etc.
2. Occasionally other applications fail with a segmentation fault or
bus error.
3. The X Window System is extremely unstable; both the applications and
the X server may fail with SIGILL/SIGSEGV/SIGBUS etc.
To track down the problems, I modified arch/mips/signal.c to make the
kernel dump core unconditionally (even if a user handler is installed) for
SIGILL/SIGSEGV/SIGBUS.
This way I got many core files for XFree86, and I found that they all look
quite similar: all are around the kernel-generated sigreturn code (two
examples are attached). Days ago I added a 'sync' after the writeback and
the situation was much better, but I still saw this kind of failure even
with the 'sync'. I had to go further and use 'Writeback_SD'; so far there
have been no more such faults. But just as Adam pointed out, it may just
mask another error. I have added code to r4k_flush_cache_sigtramp and
sigreturn, and when the X server fails I do observe that there was a flush
for the faulting address but no sigreturn was executed. So I am at my
wit's end :(. Maybe some complex scheduling or reentrancy problem? Or even
a bug in context management (e.g. we are using another process's stack)?
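For reference, the kernel-generated sigreturn code mentioned above is just two instructions, which the attached dumps show as `li v0,4119` followed by `syscall`. The sketch below builds those two words; the function name is made up, and encoding `li` as `addiu v0, zero, imm` is what I believe the kernel does rather than something taken from this thread.

```c
#include <stdint.h>

#define NR_O32_SIGRETURN 4119u   /* 4000 + 119, the value seen in the dumps */

/* Hypothetical helper: fill tramp[0..1] with the sigreturn trampoline.
 *   tramp[0] = addiu v0, zero, nr   (shown by gdb as `li v0,nr`)
 *   tramp[1] = syscall
 */
static void build_sigtramp(uint32_t tramp[2], uint32_t nr)
{
    tramp[0] = 0x24020000u | (nr & 0xffffu);
    tramp[1] = 0x0000000cu;
}
```

Because these words are stored through the D-cache and then fetched through the I-cache, they are only safe to execute once the writeback and the I-cache invalidate for that line have both taken effect.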
Using Writeback_SD only helps with the X server problem; the other problems
look cache related. So I tried running with the L3 cache disabled. That
helps greatly: no oopses now. With a little tweak to the IDE code, the
'lost interrupt' problem seems to be gone too.
But with only the L3 disabled, the X server problem remains.
I am running stress tests now. I hope they won't give me more surprises.
And here I have a question for Adam: the original Linux code uses
"Writeback_Inv_D" and "Hit_Invalidate_I", not "Writeback_D" and
"Hit_Invalidate_I". Could that lead to the problem?
BTW, a silly question: how can I make my email show up prettier? I find
that the mailing list often breaks my lines very badly. I feel guilty
about that :) I am using Mozilla Composer; the original line breaks are
inserted manually (I hit Enter when I feel the line is long enough).
Thank you for any help.
[-- Attachment #2: out --]
[-- Type: text/plain, Size: 3968 bytes --]
GNU gdb 2002-04-01-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "mipsel-linux"...(no debugging symbols found)...
Core was generated by `/usr/bin/X11/X -dpi 100 -nolisten tcp'.
Program terminated with signal 4, Illegal instruction.
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/ld.so.1
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...
done.
Loaded symbols for /lib/libnss_files.so.2
GDB is unable to find the start of the function at 0x7fff7600
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
This problem is most likely caused by an invalid program counter or
stack pointer.
However, if you think GDB should simply search farther back
from 0x7fff7600 for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
#0 0x7fff7600 in ?? ()
(gdb) where
#0 0x7fff7600 in ?? ()
(gdb) disass 0x7fff7580 0x7fff7680
Dump of assembler code from 0x7fff7580 to 0x7fff7680:
0x7fff7580: nop
0x7fff7584: nop
0x7fff7588: nop
0x7fff758c: nop
0x7fff7590: nop
0x7fff7594: nop
0x7fff7598: nop
0x7fff759c: nop
0x7fff75a0: nop
0x7fff75a4: nop
0x7fff75a8: nop
0x7fff75ac: nop
0x7fff75b0: nop
0x7fff75b4: nop
0x7fff75b8: nop
0x7fff75bc: nop
0x7fff75c0: nop
0x7fff75c4: nop
0x7fff75c8: nop
0x7fff75cc: nop
0x7fff75d0: nop
0x7fff75d4: beq at,t3,0x80003ef8
0x7fff75d8: sllv zero,zero,zero
0x7fff75dc: 0xe
0x7fff75e0: beq zero,gp,0x7ffef244
0x7fff75e4: beq zero,t0,0x800012e8
0x7fff75e8: 0x7fff7600
0x7fff75ec: beq zero,t2,0x7ffd98b0
0x7fff75f0: sd ra,-1(ra)
0x7fff75f4: sd ra,-1(ra)
0x7fff75f8: slti sp,s6,-25040
0x7fff75fc: beq zero,t2,0x7ffd9cc0
0x7fff7600: li v0,4119
0x7fff7604: syscall
0x7fff7608: 0x12c
0x7fff760c: lb a0,-19437(zero)
0x7fff7610: slti s1,s6,17812
0x7fff7614: nop
0x7fff7618: nop
0x7fff761c: nop
0x7fff7620: 0xcf9210
0x7fff7624: nop
0x7fff7628: mfhi zero
0x7fff762c: nop
0x7fff7630: beq zero,at,0x8000caa4
0x7fff7634: nop
0x7fff7638: 0xe
0x7fff763c: nop
0x7fff7640: slti v0,k1,28680
0x7fff7644: nop
0x7fff7648: 0x1228
0x7fff764c: nop
0x7fff7650: nop
0x7fff7654: nop
0x7fff7658: multu zero,zero
0x7fff765c: nop
0x7fff7660: nop
0x7fff7664: nop
0x7fff7668: slti t9,s7,17756
0x7fff766c: nop
0x7fff7670: beq at,t3,0x7fff6f04
0x7fff7674: nop
0x7fff7678: 0x12c
0x7fff767c: nop
End of assembler dump.
(gdb) info reg
          zero       at       v0       v1       a0       a1       a2       a3
 R0   00000000 b004b400 ffffffff ffffffff 00000001 7fff73f8 00000000 00000000
            t0       t1       t2       t3       t4       t5       t6       t7
 R8   0000b400 00000000 00000000 00000000 00000000 822b2880 822b2900 00000000
            s0       s1       s2       s3       s4       s5       s6       s7
 R16  00000000 102b3248 00000004 0000000e 101cdf18 102b1820 00000001 00000000
            t8       t9       k0       k1       gp       sp       s8       ra
 R24  00000000 006c86ec 00000000 00000000 10082740 7fff75f0 7fff7d08 7fff7600
            sr       lo       hi      bad    cause       pc
      a004b413 00000002 00000000 8009c6a0 00000028 7fff7600
           fsr      fir       fp
      00800004 00000000 00000000
(gdb) quit
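A note on reading the `cause` column in these register dumps: the exception code sits in bits 6..2 of the CP0 Cause register and bit 31 (BD) says whether the faulting instruction was in a branch delay slot. The sketch below extracts both; the function names are hypothetical.

```c
#include <stdint.h>

/* Exception code, bits 6..2 of the CP0 Cause register. */
static unsigned cause_exccode(uint32_t cause)
{
    return (cause >> 2) & 0x1f;
}

/* Branch-delay flag, bit 31 of the CP0 Cause register. */
static int cause_bd(uint32_t cause)
{
    return (int)((cause >> 31) & 1u);
}
```

Both attachments decode to exception code 10 (Reserved Instruction), which matches the SIGILL gdb reports; cause 00000028 has BD clear and cause 80000028 has BD set. That fits the CPU fetching something other than the intended trampoline words, though it does not by itself say why.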
[-- Attachment #3: out1 --]
[-- Type: text/plain, Size: 3412 bytes --]
GNU gdb 2002-04-01-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "mipsel-linux"...(no debugging symbols found)...
Core was generated by `/bin/sh /usr/bin/X11/startx'.
Program terminated with signal 4, Illegal instruction.
GDB is unable to find the start of the function at 0x7fff75b8
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
This problem is most likely caused by an invalid program counter or
stack pointer.
However, if you think GDB should simply search farther back
from 0x7fff75b8 for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
#0 0x7fff75b8 in ?? ()
(gdb) info reg
zero at v0 v1 a0 a1 a2 a3
R0 00000000 2ad918f0 2ad918f0 0000000a 00000012 7fff7538 00000001 00000001
t0 t1 t2 t3 t4 t5 t6 t7
R8 0000000a 2aca6394 00000000 00000004 00000000 00000000 00000000 07200720
s0 s1 s2 s3 s4 s5 s6 s7
R16 00000000 00000004 00000080 7fff7878 00000003 ffffffff 1000f0f8 00000001
t8 t9 k0 k1 gp sp s8 ra
R24 00000000 00000000 00000000 00000000 1000d880 7fff7590 00000003 7fff75a0
sr lo hi bad cause pc
a004f413 000001b0 00000000 8009c6a0 80000028 7fff75b8
fsr fir fp
00000000 00000000 00000000
(gdb) disass 0x7fff7500 0x7fff7600
Dump of assembler code from 0x7fff7500 to 0x7fff7600:
0x7fff7500: 0xc2009d
0x7fff7504: 0x10000e8
0x7fff7508: 0x11a0110
0x7fff750c: 0x990121
0x7fff7510: slti t9,s6,32304
0x7fff7514: tltu a0,t9,0x2
0x7fff7518: slti t9,s6,32304
0x7fff751c: 0x442c88
0x7fff7520: nop
0x7fff7524: nop
0x7fff7528: nop
0x7fff752c: nop
0x7fff7530: b 0x7ffed734
0x7fff7534: nop
0x7fff7538: nop
0x7fff753c: slti t8,s6,-8108
0x7fff7540: nop
0x7fff7544: sllv zero,zero,zero
0x7fff7548: sll zero,zero,0x2
0x7fff754c: 0x7fff7878
0x7fff7550: sra zero,zero,0x0
0x7fff7554: sd ra,-1(ra)
0x7fff7558: b 0x7fff393c
0x7fff755c: b 0x7ffed760
0x7fff7560: teq v0,a0,0xa9
0x7fff7564: nop
0x7fff7568: nop
0x7fff756c: nop
0x7fff7570: nop
0x7fff7574: nop
0x7fff7578: b 0x7ffed77c
0x7fff757c: nop
0x7fff7580: nop
0x7fff7584: b 0x7ffed788
0x7fff7588: 0x7fff75a0
0x7fff758c: 0x1
0x7fff7590: b 0x7fff2174
0x7fff7594: 0x7fff7804
0x7fff7598: slti t9,s6,32304
0x7fff759c: 0x475718
0x7fff75a0: li v0,4119
0x7fff75a4: syscall
0x7fff75a8: slti t8,s6,-8108
0x7fff75ac: lb a0,-3053(zero)
0x7fff75b0: slti t5,s6,9620
0x7fff75b4: nop
0x7fff75b8: nop
0x7fff75bc: nop
0x7fff75c0: b 0x7fff8cb4
0x7fff75c4: nop
0x7fff75c8: sllv zero,zero,zero
0x7fff75cc: nop
0x7fff75d0: nop
0x7fff75d4: nop
0x7fff75d8: sra zero,zero,0x0
0x7fff75dc: nop
0x7fff75e0: 0x7fff7878
0x7fff75e4: nop
0x7fff75e8: sll zero,zero,0x2
0x7fff75ec: nop
0x7fff75f0: 0x1
0x7fff75f4: nop
0x7fff75f8: mult zero,zero
0x7fff75fc: nop
End of assembler dump.
(gdb) quit
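As a reading aid for the dump above, here is a minimal C sketch that decodes the two cause values shown (80000028 and 00000028) and rebuilds the first trampoline word seen at 0x7fff75a0. The helper names (decode_cause, sigreturn_tramp_word0) are hypothetical and not taken from the kernel. The field layout is the standard MIPS Cause register (ExcCode in bits 6..2, BD in bit 31), and the sketch assumes that "li v0,4119" is the addiu $v0,$zero,imm form.

```c
#include <assert.h>
#include <stdint.h>

/* MIPS Cause register fields used in this sketch. */
#define CAUSE_BD          0x80000000u          /* fault was in a branch delay slot */
#define CAUSE_EXCCODE(c)  (((c) >> 2) & 0x1fu) /* ExcCode lives in bits 6..2 */
#define EXC_RI            10u                  /* Reserved Instruction */

/* o32 syscall numbers start at 4000; sigreturn is 4000 + 119 = 4119. */
#define NR_O32_BASE       4000u
#define NR_O32_SIGRETURN  (NR_O32_BASE + 119u)

#define INSN_SYSCALL      0x0000000cu          /* second trampoline word */

struct cause_info {
    unsigned exccode;      /* exception code, e.g. 10 for RI */
    int in_delay_slot;     /* 1 if the BD bit is set */
};

/* Hypothetical helper: split a raw Cause value into the fields above. */
struct cause_info decode_cause(uint32_t cause)
{
    struct cause_info info;

    info.exccode = CAUSE_EXCCODE(cause);
    info.in_delay_slot = (cause & CAUSE_BD) != 0;
    return info;
}

/* Assumption: "li v0,imm" is encoded as addiu $v0,$zero,imm. */
uint32_t sigreturn_tramp_word0(void)
{
    return 0x24020000u | (NR_O32_SIGRETURN & 0xffffu);
}
```

Both dumps decode to ExcCode 10 (Reserved Instruction), which matches gdb's "signal 4, Illegal instruction". In the second dump the BD bit is also set. The disassembly shows a plain nop at the faulting pc, so what the CPU executed apparently did not match what memory holds, which would be consistent with a stale I-cache line.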
* Re: RM7k cache_flush_sigtramp
2003-08-06 11:00 ` Fuxin Zhang
@ 2003-08-06 11:55 ` Ralf Baechle
2003-08-06 12:52 ` Fuxin Zhang
0 siblings, 1 reply; 22+ messages in thread
From: Ralf Baechle @ 2003-08-06 11:55 UTC (permalink / raw)
To: Fuxin Zhang; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
On Wed, Aug 06, 2003 at 07:00:07PM +0800, Fuxin Zhang wrote:
> And here I have a question for Mr. Adam: original linux code use
> 'Writeback_Inv_D"
> and "Hit_Invalidate_I",not "Writeback_D" and "Hit_Invalidate_I",could it
> lead to the
> problem?
No. To synchronize the D-cache and I-cache it's irrelevant if you
invalidate the D-cache or not.
> BTW:
> a silly question: how can i make my email show up pretier? I find
> that the mailing list
> often break my lines very badly. I feel guilty for that:) I am using
> mozilla composer,the
> original linebreaks are manually inserted(hit enter when i feel it is
> long enough).
Format your email with hard breaks to about 75 columns. 75 columns
because god made vt100 with 80 columns so that leaves a bit of space for
quoting your mail nicely.
Now for your register dumps and information:
> (gdb) info reg
[...]
> t8 t9 k0 k1 gp sp s8 ra
> R24 00000000 00000000 00000000 00000000 1000d880 7fff7590 00000003 7fff75a0
> sr lo hi bad cause pc
> a004f413 000001b0 00000000 8009c6a0 80000028 7fff75b8
[...]
> 0x7fff75a0: li v0,4119
> 0x7fff75a4: syscall
So the pc is pointing just after the trampoline which suspiciously looks
like the return of an old bug. Could your application be doing something
unusual such as forking from a signal handler or similar? The scenario
is about
- kernel installs signal trampoline on stack
- kernel forks. Now the signal trampoline installed in the first step
resides on a copy-on-write page.
- newly created process touches the cow page, thereby resulting in
breaking of the cow page. Now parent and child have their own copy
of the page. BUT: flush_cache_page() doesn't properly flush this page.
- Parent executes again on the copy of the page for which caches have
not been flushed properly in the previous step, thereby failing to
execute the trampoline - crash.
Ralf
* Re: RM7k cache_flush_sigtramp
2003-08-06 11:55 ` Ralf Baechle
@ 2003-08-06 12:52 ` Fuxin Zhang
2003-08-06 14:45 ` Ralf Baechle
0 siblings, 1 reply; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-06 12:52 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
Ralf Baechle wrote:
>On Wed, Aug 06, 2003 at 07:00:07PM +0800, Fuxin Zhang wrote:
>
>
>
>> And here I have a question for Mr. Adam: original linux code use
>>'Writeback_Inv_D"
>>and "Hit_Invalidate_I",not "Writeback_D" and "Hit_Invalidate_I",could it
>>lead to the
>>problem?
>>
>>
>
>No. To synchronize the D-cache and I-cache it's irrelevant if you
>invalidate the D-cache or not.
>
>
I think so. Just in case the hardware is doing something strange:)
>
>
>>BTW:
>> a silly question: how can i make my email show up pretier? I find
>>that the mailing list
>>often break my lines very badly. I feel guilty for that:) I am using
>>mozilla composer,the
>>original linebreaks are manually inserted(hit enter when i feel it is
>>long enough).
>>
>>
>
>Format your email with hard breaks to about 75 columns. 75 columns
>because god made vt100 with 80 columns so that leaves a bit of space for
>quoting your mail nicely.
>
Thanks:)
>
>Now for your register dumps and information:
>
>
>> sr lo hi bad cause pc
>> a004f413 000001b0 00000000 8009c6a0 80000028 7fff75b8
>>
>>
>[...]
>
>
>>0x7fff75a0: li v0,4119
>>0x7fff75a4: syscall
>>
>>
>
>So the pc is pointing just after the trampoline which suspiciously looks
>like the return of an old bug. Could your application be doing something
>unusual such as forking from a signal handler or similar? The scenario
>
I am not sure. It is the standard X distribution from debian-woody. It is
fairly easy to reproduce: just move the mouse around and click here and
there, and it dies. I will check this later, but I think such a giant as
the X server won't fork frequently.
>is about
>
> - kernel installs signal trampoline on stack
> - kernel forks. Now the signal trampoline installed in the first step
> resides on a copy-on-write page.
> - newly created process touches the cow page, thereby resulting in
> breaking of the cow page. Now parent and child have their own copy
> of the page. BUT: flush_cache_page() doesn't properly flush this page
>
>
> - Parent executes again on the copy of the page for which caches have
>
If the new process touches the cow page first, shouldn't it get a new page
and leave the original page for the parent? If so, the parent should be
able to see the trampoline content from the icache anyway (either L2 or
memory should have the value), though the child may not?
> not been flushed proplerly in the previous step, thereby failing to
> execute the trampoline - crash.
>
The RM7000 has 16k 4-way set-associative primary caches, which are
supposed to have no cache aliasing problem.
Bad news: it oopsed again :( while running
while true; do fsck -y -f /dev/hda4 ; done
after about 5 successful runs.
So there are still some problems lurking somewhere.
It seems I have to switch some hardware...
>
> Ralf
>
>
>
>
>
* Re: RM7k cache_flush_sigtramp
2003-08-06 12:52 ` Fuxin Zhang
@ 2003-08-06 14:45 ` Ralf Baechle
2003-08-06 15:04 ` Fuxin Zhang
0 siblings, 1 reply; 22+ messages in thread
From: Ralf Baechle @ 2003-08-06 14:45 UTC (permalink / raw)
To: Fuxin Zhang; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
On Wed, Aug 06, 2003 at 08:52:46PM +0800, Fuxin Zhang wrote:
> I am not sure. It is stardard X distribution from debian-woody. Fairly
> easy to reproduce,just move the mouse
> around and click here and there then it would die. Will check this
> later,but I think such a giant as Xserver won't fork frequently.
The scenario I was describing was just how we originally discovered the
bug. Supposedly that was fixed, but your register dump and disassembly
show the exact fingerprint of that old problem, so I thought I should
describe it in the hope it's going to help you.
> If the new process touch the cow page first,shouldn't it get a new page
> and leave the original page for parent?
> If so,the parent should be able to see the trampoline content from
> icache anyway(either L2 or memory should
> have the value),though the child may not?
RM7000 has a physically indexed cache. That means if the copy of the
page wasn't explicitly or implicitly written back to L2, whichever
process ends up with the copy of the page might fetch stale instructions
from memory - boom.
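To make the stale-fetch case concrete, here is a toy C model, purely illustrative and not kernel code. All names (toy_system, toy_store and so on) are invented for this sketch. It assumes a single write-back D-cache line and a single I-cache line in front of one word of memory, with I-cache refills reading memory directly.

```c
#include <assert.h>
#include <stdint.h>

/* One word of memory (standing in for L2/RAM) behind two one-line caches. */
struct toy_system {
    uint32_t memory;        /* what an I-cache refill reads */
    uint32_t dcache_data;
    int dcache_valid;
    int dcache_dirty;
    uint32_t icache_data;
    int icache_valid;
};

void toy_init(struct toy_system *s, uint32_t initial)
{
    s->memory = initial;
    s->dcache_data = 0;
    s->dcache_valid = 0;
    s->dcache_dirty = 0;
    s->icache_data = 0;
    s->icache_valid = 0;
}

/* A store lands in the write-back D-cache only; memory is not updated. */
void toy_store(struct toy_system *s, uint32_t value)
{
    s->dcache_data = value;
    s->dcache_valid = 1;
    s->dcache_dirty = 1;
}

/* Write the dirty line back; the line stays valid in the D-cache. */
void toy_writeback_d(struct toy_system *s)
{
    if (s->dcache_valid && s->dcache_dirty) {
        s->memory = s->dcache_data;
        s->dcache_dirty = 0;
    }
}

/* Drop the I-cache line so the next fetch refills from memory. */
void toy_invalidate_i(struct toy_system *s)
{
    s->icache_valid = 0;
}

/* Instruction fetch: refill from memory on a miss, never from the D-cache. */
uint32_t toy_fetch(struct toy_system *s)
{
    if (!s->icache_valid) {
        s->icache_data = s->memory;
        s->icache_valid = 1;
    }
    return s->icache_data;
}
```

In this model a fetch after toy_store() alone returns the old memory contents. Only toy_writeback_d() followed by toy_invalidate_i() makes the new word visible to the fetch. Whether the D-cache line is also invalidated makes no difference, which matches the earlier remark about Writeback_Inv_D versus Writeback_D.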
> > not been flushed proplerly in the previous step, thereby failing to
> > execute the trampoline - crash.
> >
> RM7000 has 16k 4-way set-associated primary caches,which are supposed to
> have no cache aliasing problem
The described scenario is not an aliasing problem; it's the case where the
copy of the cow page hasn't properly been flushed at all. When we
isolated the bug, it turned out that neither flush_page_to_ram() nor
flush_cache_page() was flushing the cache. I suspect your case must be
something fairly similar.
Ralf
* Re: RM7k cache_flush_sigtramp
2003-08-06 14:45 ` Ralf Baechle
@ 2003-08-06 15:04 ` Fuxin Zhang
2003-08-06 22:30 ` Ralf Baechle
0 siblings, 1 reply; 22+ messages in thread
From: Fuxin Zhang @ 2003-08-06 15:04 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
Ralf Baechle wrote:
>>If the new process touch the cow page first,shouldn't it get a new page
>>and leave the original page for parent?
>>If so,the parent should be able to see the trampoline content from
>>icache anyway(either L2 or memory should
>>have the value),though the child may not?
>>
>>
>
>RM7000 has a physically indexed cache. That means if the copy of the
>page wasn't explicitly or implicitly written back to L2 the process
>whichever ends up with the copy of the page might fetch stale instructions
>from memory - boom.
>
>
>
>>> not been flushed proplerly in the previous step, thereby failing to
>>> execute the trampoline - crash.
>>>
>>>
>>>
>>RM7000 has 16k 4-way set-associated primary caches,which are supposed to
>>have no cache aliasing problem
>>
>>
>
>The described scenario is not an aliasing problem; it's the case where the
>copy of the cow page hasn't properly been flushed at all. When we
>isolated the bug was that neither flush_page_to_ram() nor flush_cache_page()
>were flushing the cache. I suspect your case must be something fairly
>
>
After the cache rewrite, flush_page_to_ram is empty (a no-op), and in this
case flush_cache_page does nothing for a stack page (it flushes only when
has_dc_aliases or exec is set).
So whichever process uses the new copy will have a problem?! Am I missing
something?
Thank you very much, great Ralf :).
>similar.
>
> Ralf
>
>
>
>
* Re: RM7k cache_flush_sigtramp
2003-08-06 15:04 ` Fuxin Zhang
@ 2003-08-06 22:30 ` Ralf Baechle
0 siblings, 0 replies; 22+ messages in thread
From: Ralf Baechle @ 2003-08-06 22:30 UTC (permalink / raw)
To: Fuxin Zhang; +Cc: Adam Kiepul, MAKE FUN PRANK CALLS
On Wed, Aug 06, 2003 at 11:04:19PM +0800, Fuxin Zhang wrote:
> After cache rewrite,flush_page_to_ram is null; and in this case
> flush_cache_page
> do nothing for a stack page. (It flushes only when has_dc_aliases or
> exec set).
> So the one use the new copy will have problem ?! Am I missing
> something?
The stack page contains the trampoline so it must be marked executable,
so on an RM7000 flush_dcache_page must flush both the D-cache and
I-cache.
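The condition discussed here (flush only when the cache has aliases or the mapping is executable) can be written as a tiny predicate. This is a hedged sketch, not the kernel's code: toy_cpu, toy_vma, TOY_VM_EXEC and toy_page_needs_flush are made-up stand-ins for the real structures and flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "executable mapping" flag. */
#define TOY_VM_EXEC 0x4UL

struct toy_cpu {
    bool has_dc_aliases;    /* true if the D-cache can alias */
};

struct toy_vma {
    unsigned long flags;    /* mapping flags, e.g. TOY_VM_EXEC */
};

/* Flush if the D-cache can alias or the page may be executed from. */
bool toy_page_needs_flush(const struct toy_cpu *cpu, const struct toy_vma *vma)
{
    bool exec = (vma->flags & TOY_VM_EXEC) != 0;

    return cpu->has_dc_aliases || exec;
}
```

With has_dc_aliases false, as claimed for the RM7000 earlier in the thread, the result depends only on the exec flag. A stack mapping that carries the trampoline and is marked executable therefore still gets flushed, while a non-executable one would not.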
Ralf
end of thread, other threads:[~2003-08-06 22:30 UTC | newest]
Thread overview: 22+ messages
2003-08-01 15:42 RM7k cache_flush_sigtramp Adam Kiepul
2003-08-04 3:38 ` Fuxin Zhang
2003-08-06 11:00 ` Fuxin Zhang
2003-08-06 11:55 ` Ralf Baechle
2003-08-06 12:52 ` Fuxin Zhang
2003-08-06 14:45 ` Ralf Baechle
2003-08-06 15:04 ` Fuxin Zhang
2003-08-06 22:30 ` Ralf Baechle
-- strict thread matches above, loose matches on Subject: below --
2003-07-31 16:50 Adam Kiepul
2003-08-01 0:40 ` Fuxin Zhang
2003-08-01 3:01 ` Ralf Baechle
2003-08-01 4:59 ` Fuxin Zhang
2003-08-01 7:51 ` Dominic Sweetman
2003-08-01 7:51 ` Dominic Sweetman
2003-08-01 9:26 ` Ralf Baechle
2003-08-01 14:18 ` Fuxin Zhang
2003-08-02 17:02 ` Ralf Baechle
2003-08-04 8:45 ` Dominic Sweetman
2003-08-04 11:51 ` Maciej W. Rozycki
2003-07-31 1:56 Fuxin Zhang
2003-07-31 11:46 ` Ralf Baechle
2003-07-31 12:57 ` Fuxin Zhang