linuxppc-dev.lists.ozlabs.org archive mirror
From: Tom Rini <trini@kernel.crashing.org>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: linuxppc-embedded@ozlabs.org, Dan Malek <dan@embeddedalley.com>,
	Joakim Tjernlund <joakim.tjernlund@transmode.se>,
	gtolstolytkin@ru.mvista.com
Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
Date: Mon, 7 Nov 2005 08:51:47 -0700
Message-ID: <20051107155147.GC3839@smtp.west.cox.net>
In-Reply-To: <20051107101618.GA15522@logos.cnet>

On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> Joakim!
> 
> On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > Hi Marcelo
> > 
> > [SNIP] 
> > > > The root of the problem is the changes against the 8xx TLB handlers
> > > > introduced during v2.6. What happens is the TLBMiss handlers load the
> > > > zeroed pte into the TLB, causing the TLBError handler to be invoked
> > > > (that's two TLB faults per pagefault), which then jumps to the generic
> > > > MM code to setup the pte.
> > > 
> > > The bug is that the zeroed TLB is not invalidated (the same reason
> > > for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> > > 
> > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > 
> > This is one reason why it is the way it is:
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > The details are a little fuzzy ATM, but I think the reason for the
> > > current impl. was only that it was less intrusive to impl.
> 
> Ah, I see. I wonder if the bug is processor specific: we don't have such
> changes in our v2.4 tree and never experienced such problem.
> 
> It should be pretty easy to hit it right? (instruction pagefaults should
> fail).
> 
> Grigori, Tom, can you enlighten us about the issue at the URL above? How
> can it be triggered?

So after looking at the code in 2.6.14 and current git, I think the
above URL isn't relevant, unless there was a change I missed (which is
entirely possible) that reverted the patch there and fixed that issue in
a different manner.  But since I didn't figure that out until I had
finished researching it again, here is what I found:

Switching hats for a minute, this came from a bug a customer of
MontaVista found, so I can't give out the testcase :(

To repeat what Joakim said back then:
"I think I have figured this out. The first TLB misses that happen at
app startup are Data TLB misses. These will then hit the NULL L1 entry
and end up in do_page_fault() which will populate the L1 entry. But when
you have a very large app that spans more than one L1 entry (16 MB I
think) it may happen that you will have an I-TLB Miss first on one of
the L1 entries, which will make the I-TLB handler bail out to
do_page_fault() and the app crashes (SEGV)."
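
To make the sequence above concrete, here is a minimal user-space model
of it. This is only a sketch, not kernel code: the names (tlb_miss,
l1_present, itlb_miss_handled) are made up for illustration, and the
16 MB span per L1 entry is simply the guess from the quoted text, not a
verified figure for the 8xx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 16 MB per L1 entry, taken from the quoted guess above. */
#define L1_SHIFT   24u
#define L1_ENTRIES 256u   /* 4 GB / 16 MB */

enum access  { DATA_ACCESS, INSN_ACCESS };
enum outcome { TLB_LOADED, L1_POPULATED, SEGV };

/* false models a NULL L1 entry. */
static bool l1_present[L1_ENTRIES];

static void reset_l1(void)
{
    for (unsigned i = 0; i < L1_ENTRIES; i++)
        l1_present[i] = false;
}

/*
 * Model of one TLB miss as described in the quote. A data miss on a
 * NULL L1 entry reaches the page fault path, which populates the entry.
 * An instruction miss on a NULL L1 entry is treated as fatal unless
 * itlb_miss_handled is set (hypothetical flag standing in for a fix).
 */
static enum outcome tlb_miss(uint32_t addr, enum access kind,
                             bool itlb_miss_handled)
{
    unsigned idx = addr >> L1_SHIFT;

    assert(idx < L1_ENTRIES);
    if (l1_present[idx])
        return TLB_LOADED;
    if (kind == DATA_ACCESS || itlb_miss_handled) {
        l1_present[idx] = true;
        return L1_POPULATED;
    }
    return SEGV;
}
```

In this model a small app only ever touches L1 entry 0, which a data
miss populates before any instruction fetch needs it. An instruction
fetch at 0x01000000 lands in L1 entry 1, which nothing has populated
yet, and that is the case that returns SEGV.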

Looking at the patch again, what I don't understand is why I talked
about fudging I-TLB Miss at 0x400 when it's I-TLB Error that we fudge as
being there.  I then got hung up on the slight difference between the
two ("This is because we check bit 4 of SRR1 in both cases, but in the
case of an I-TLB Miss, this bit is always set, and it only indicates a
protection fault on an I-TLB Error.").  So instead of 0x1300 jumping to
the handler at 0x400, we treat it like a regular exception so that we
know where we came from.  Perhaps we missed fixing a case somewhere?
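
As a rough illustration of why knowing where we came from matters, the
sketch below contrasts a check that only looks at SRR1 with one that
also looks at the originating vector. The function names are invented
for this example, the mask assumes PowerPC bit numbering (bit 0 is the
most significant bit), and 0x1100 as the I-TLB Miss vector is my
assumption; only 0x400 and 0x1300 appear in the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of a 32-bit register, counting from the MSB as bit 0. */
#define SRR1_BIT4 0x08000000u

/* 0x400 and 0x1300 are from the message; 0x1100 is an assumption. */
enum trap {
    TRAP_ISI        = 0x400,
    TRAP_ITLB_MISS  = 0x1100,
    TRAP_ITLB_ERROR = 0x1300
};

/* Origin-blind check: every vector funnels into the same test. */
static bool protection_fault_blind(uint32_t srr1)
{
    return (srr1 & SRR1_BIT4) != 0;
}

/*
 * Origin-aware check: per the quoted explanation, bit 4 is always set
 * on an I-TLB Miss, so it carries no information there and is ignored.
 */
static bool protection_fault(enum trap origin, uint32_t srr1)
{
    assert(origin == TRAP_ISI || origin == TRAP_ITLB_MISS ||
           origin == TRAP_ITLB_ERROR);
    if (origin == TRAP_ITLB_MISS)
        return false;
    return (srr1 & SRR1_BIT4) != 0;
}
```

With the blind check, an I-TLB Miss that arrives with bit 4 set is
reported as a protection fault, which would match the SEGV symptom; the
origin-aware check reports it only for the error vectors.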

-- 
Tom Rini
http://gate.crashing.org/~trini/
