From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <3E81F701.70307@slac.stanford.edu>
Date: Wed, 26 Mar 2003 10:52:49 -0800
From: Till Straumann
MIME-Version: 1.0
To: joakim.tjernlund@lumentis.se
Cc: Dan Malek, Tom Rini, "Linuxppc-Embedded@Lists. Linuxppc. Org"
Subject: Re: dcbz works on 862 everywhere!
References: <3E810DA7.1060605@slac.stanford.edu> <3E81C4B3.2080006@embeddededge.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Sender: owner-linuxppc-embedded@lists.linuxppc.org
List-Id:

Jocke.

There's of course a trivial test to check Dan's suspicion that
it might just be a lucky sequence of TLBMiss-TLBError exceptions
helping you out.

Just load some bogus values (a la 0xdeadbeef) into MD_EPN and DAR
shortly prior to returning from the TLBMiss exception (but after
walking the tables, of course). Thereby you make sure that any
'leftover' values seen by a subsequent TLBError are invalidated.

-- Till

Dan Malek wrote:
> Till Straumann wrote:
>
>> I found that 'dcbz' (while failing to set DAR)
>> indeed sets MD_EPN correctly. Hence, Jocke's fix
>> (copy EPN[0:19]->DAR) would handle that.
>
> After sleeping for a couple of days and consuming large
> amounts of medicine to cure a cold, I think I understand
> why copying these bits around seems to "fix" problems.
>
> It's all related to the sequence of TLB miss/error exceptions
> that I had been describing all along.  The first thing that
> is going to most likely happen is you will get a TLB miss to
> load a PTE into the TLB.  It will be marked valid but not
> dirty (not writable).  Immediately upon performing the rfi
> you will get a TLB Error to handle the dirty PTE update.
> By copying the bits from MD_EPN to the DAR in the miss handler,
> the Error handler will have at least a 4K boundary aligned DAR
> and it will execute correctly to update the dirty state.  At
> this point, it will appear to "work" properly (even though
> it is likely the dcbz didn't execute) because the system will
> at least keep running (for a while).
>
> If you have a situation where you get a TLB Error without
> a matching TLB miss (very rare, but they can happen as the
> result of swapping, copy on write, certain other page table
> updates), then you are hosed.  The DAR will contain some information
> from a previous exception, we will likely end up on a "hung"
> system continually taking TLB Error exceptions because we
> can't fix them properly.  This is basically what happens
> without the bit copying "fix".
>
>> My older idea (fixing up MD_EPN and DAR based
>> on the faulting instruction opcode and the involved
>> GPR contents) should work even if we have neither
>> a valid MD_EPN nor DAR.
>
> All of the TLB exception handlers must have minimal instructions.
> The ones in Linux are too big already.  The very little you would
> gain from making a dcbz/dcbt work correctly would be lost many,
> many, many times over in a more complex TLB exception handler.
>
> Copying bits from MD_EPN to DAR doesn't set the DAR "correctly",
> it only gives you the page boundary.  This is going to further
> confuse debuggers or signal handlers if you actually have an
> addressing bug that is detected by one of these instructions.
>
> The only update I would like to see to TLB exception handlers is
> the removal of code due to streamlining of the page table organization.
>
> Thanks.
>
>
> 	-- Dan
>

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/
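
The poisoning test proposed at the top of this mail can be illustrated with a small user-space model in C. This is only a sketch under stated assumptions, not the Linux 8xx handler code: the struct and function names are hypothetical, the registers are plain variables rather than real SPR accesses, the EPN[0:19]->DAR copy is placed in the error handler (Dan's description puts it in the miss handler), and whether the hardware updates MD_EPN on a TLB Error caused by dcbz is left as a parameter because that is exactly what the test is meant to find out.

```c
#include <stdbool.h>
#include <stdint.h>

#define EPN_MASK 0xFFFFF000u   /* EPN[0:19]: the upper 20 bits, i.e. the 4K page */
#define POISON   0xdeadbeefu   /* bogus value, as suggested in the mail */

/* Hypothetical stand-ins for the two registers; not real SPR accesses. */
struct mmu_regs {
    uint32_t md_epn;
    uint32_t dar;
};

/* Modelled TLB Miss: the hardware latches the page in MD_EPN; the handler
 * optionally poisons both registers just before its (modelled) rfi. */
static void tlb_miss(struct mmu_regs *r, uint32_t ea, bool poison)
{
    r->md_epn = ea & EPN_MASK;      /* set by hardware on the miss */
    /* ... page table walk and TLB load would happen here ... */
    if (poison) {
        r->md_epn = POISON;
        r->dar = POISON;
    }
}

/* Modelled TLB Error for a dcbz: DAR is never set by hardware here.
 * Whether MD_EPN is set is the open question, so it is a parameter.
 * The handler applies the bit copy and returns the page it would fix up. */
static uint32_t tlb_error(struct mmu_regs *r, uint32_t ea, bool hw_sets_epn)
{
    if (hw_sets_epn)
        r->md_epn = ea & EPN_MASK;
    r->dar = r->md_epn & EPN_MASK;  /* EPN[0:19] -> DAR */
    return r->dar;
}

/* One miss-then-error sequence for a dcbz at effective address ea. */
static uint32_t miss_then_error(uint32_t ea, bool poison, bool hw_sets_epn)
{
    struct mmu_regs r = { 0, 0 };
    tlb_miss(&r, ea, poison);
    return tlb_error(&r, ea, hw_sets_epn);
}
```

In this model, miss_then_error(0x10012345, false, false) returns 0x10012000 even though the error exception set nothing, which is the "lucky sequence" case. With poisoning enabled the same call returns 0xdeadb000, so reliance on leftovers becomes visible, while miss_then_error(0x10012345, true, true) still returns 0x10012000.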