From: Tom Rini <trini@kernel.crashing.org>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: linuxppc-embedded@ozlabs.org, Dan Malek <dan@embeddedalley.com>,
Joakim Tjernlund <joakim.tjernlund@transmode.se>,
gtolstolytkin@ru.mvista.com
Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
Date: Mon, 7 Nov 2005 08:51:47 -0700
Message-ID: <20051107155147.GC3839@smtp.west.cox.net>
In-Reply-To: <20051107101618.GA15522@logos.cnet>
On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> Joakim!
>
> On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > Hi Marcelo
> >
> > [SNIP]
> > > The root of the problem are the changes against the 8xx TLB
> > > handlers introduced
> > > during v2.6. What happens is the TLBMiss handlers load the
> > > zeroed pte into
> > > the TLB, causing the TLBError handler to be invoked (thats
> > > two TLB faults per
> > > pagefault), which then jumps to the generic MM code to setup the pte.
> > >
> > > The bug is that the zeroed TLB is not invalidated (the same reason
> > > for the "dcbst" misbehaviour), resulting in infinite TLBError faults.
> > >
> > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> >
> > This is one reason why it is the way it is:
> > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > This details are little fuzzy ATM, but I think the reason for the
> > current
> > impl. was only that it was less intrusive to impl.
>
> Ah, I see. I wonder if the bug is processor specific: we don't have such
> changes in our v2.4 tree and never experienced such problem.
>
> It should be pretty easy to hit it right? (instruction pagefaults should
> fail).
>
> Grigori, Tom, can you enlight us about the issue on the URL above. How
> can it be triggered?
So after looking at the code in 2.6.14 and current git, I think the
above URL isn't relevant, unless there was a change I missed (which
could totally be possible) that reverted the patch there and fixed that
issue in a different manner. But since I didn't figure that out until I
had finished researching it again:
Switching hats for a minute, this came from a bug a customer of
MontaVista found, so I can't give out the testcase :(
To repeat what Joakim said back then:
"I think I have figured this out. The first TLB misses that happen at
app startup are Data TLB misses. These will then hit the NULL L1 entry
and end up in do_page_fault() which will populate the L1 entry. But when
you have a very large app that spans more than one L1 entry (16 MB I
think) it may happen that you will have an I-TLB Miss first on one of the
L1 entries, which will make the I-TLB handler bail out to do_page_fault()
and the app crashes (SEGV)."
Looking at the patch again, what I don't see is why I talk about fudging
I-TLB Miss at 0x400 when it's I-TLB Error that we fudge as being there. I
then get hung up on the slight difference between the two ("This
is because we check bit 4 of SRR1 in both cases, but in the case of an
I-TLB Miss, this bit is always set, and it only indicates a protection
fault on an I-TLB Error."). So instead of 0x1300 jumping to the handler
at 0x400, we treat it like a regular exception so we know where we came
from, and perhaps missed fixing a case somewhere?
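To make the SRR1 distinction quoted above concrete, here is a minimal C
sketch. It is an illustration only, not the actual handler code: the
names fault_origin, FROM_ITLB_MISS, FROM_ITLB_ERROR and
is_protection_fault are hypothetical, and the mask assumes that "bit 4"
uses PowerPC bit numbering, where bit 0 is the most significant bit of
the 32-bit register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 4 in PowerPC (MSB = bit 0) numbering of a 32-bit SRR1. */
#define SRR1_BIT4 0x08000000u

/* Hypothetical tag recording which exception we actually arrived from. */
enum fault_origin {
    FROM_ITLB_MISS,   /* bit 4 is always set here, so it says nothing */
    FROM_ITLB_ERROR   /* bit 4 set means a protection fault */
};

/*
 * Decide whether bit 4 of SRR1 really signals a protection fault.
 * The bit is only trusted when the fault came in as an I-TLB Error.
 */
bool is_protection_fault(enum fault_origin origin, uint32_t srr1)
{
    if (origin != FROM_ITLB_ERROR)
        return false;
    return (srr1 & SRR1_BIT4) != 0;
}
```

The same SRR1 value gives different answers depending on the origin,
which is why the code needs to know where it came from before testing
the bit.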
--
Tom Rini
http://gate.crashing.org/~trini/