Re: [PATCH 2.6.14] mm: 8xx MM fix for

linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed

From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Pantelis Antoniou <pantelis.antoniou@gmail.com>
Cc: Dan Malek <dan@embeddedalley.com>, linuxppc-embedded@ozlabs.org
Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
Date: Mon, 7 Nov 2005 06:44:31 -0200	[thread overview]
Message-ID: <20051107084431.GA15180@logos.cnet> (raw)
In-Reply-To: <200510302203.25390.pantelis.antoniou@gmail.com>

Hi folks,

Seems the bug is exposed by the change which avoids flushing the
TLB when not necessary (in case the pte has not changed), introduced
recently:

__handle_mm_fault():

        entry = pte_mkyoung(entry);
        if (!pte_same(old_entry, entry)) {
                ptep_set_access_flags(vma, address, pte, entry, write_access);
                update_mmu_cache(vma, address, entry);
                lazy_mmu_prot_update(entry);
        } else {
                /*
                 * This is needed only for protection faults but the arch code
                 * is not yet telling us if this is a protection fault or not.
                 * This still avoids useless tlb flushes for .text page faults
                 * with threads.
                 */
                if (write_access)
                        flush_tlb_page(vma, address);
        }

The "update_mmu_cache()" call was unconditional before, which caused the TLB
to be flushed by:

        if (pfn_valid(pfn)) {
                struct page *page = pfn_to_page(pfn);
                if (!PageReserved(page)
                    && !test_bit(PG_arch_1, &page->flags)) {
                        if (vma->vm_mm == current->active_mm) {
#ifdef CONFIG_8xx
                        /* On 8xx, cache control instructions (particularly 
                         * "dcbst" from flush_dcache_icache) fault as write 
                         * operation if there is an unpopulated TLB entry 
                         * for the address in question. To workaround that, 
                         * we invalidate the TLB here, thus avoiding dcbst 
                         * misbehaviour.
                         */
                                _tlbie(address);
#endif
                                __flush_dcache_icache((void *) address);
                        } else
                                flush_dcache_icache_page(page);
                        set_bit(PG_arch_1, &page->flags);
                }

Which worked to due to pure luck: PG_arch_1 was always unset before, but
now it isnt.

The root of the problem are the changes against the 8xx TLB handlers introduced
during v2.6. What happens is the TLBMiss handlers load the zeroed pte into
the TLB, causing the TLBError handler to be invoked (thats two TLB faults per 
pagefault), which then jumps to the generic MM code to setup the pte.

The bug is that the zeroed TLB is not invalidated (the same reason
for the "dcbst" misbehaviour), resulting in infinite TLBError faults.

Dan, I wonder why we just don't go back to v2.4 behaviour. It is not very
clear to me that "two exception" speedup offsets the additional code required
for "one exception" version. Have you actually done any measurements? 

There is chance that the additional code ends up in the same cacheline,
which would mean no huge gain by the "two exception" approach. Might be
even harmful for performance (you need two exceptions instead of one
after all).

The "two exception" approach requires a TLB flush (to nuke the zeroed)
at each PTE update for correct behaviour (which BTW is another slowdown):

--- ../git/linux-2.6/arch/ppc/mm/init.c	2005-11-01 07:58:12.000000000 -0600
+++ linux-2.6-git-wednov02/arch/ppc/mm/init.c	2005-11-07 06:13:58.000000000 -0600
@@ -597,19 +597,12 @@
 
 	if (pfn_valid(pfn)) {
 		struct page *page = pfn_to_page(pfn);
+#ifdef CONFIG_8xx
+		_tlbie(address);
+#endif
 		if (!PageReserved(page)
 		    && !test_bit(PG_arch_1, &page->flags)) {
 			if (vma->vm_mm == current->active_mm) {
-#ifdef CONFIG_8xx
-			/* On 8xx, cache control instructions (particularly 
-		 	 * "dcbst" from flush_dcache_icache) fault as write 
-			 * operation if there is an unpopulated TLB entry 
-			 * for the address in question. To workaround that, 
-			 * we invalidate the TLB here, thus avoiding dcbst 
-			 * misbehaviour.
-			 */
-				_tlbie(address);
-#endif
 				__flush_dcache_icache((void *) address);
 			} else
 				flush_dcache_icache_page(page);


On Sun, Oct 30, 2005 at 11:03:24PM +0300, Pantelis Antoniou wrote:
> Latest MMU changes caused 8xx to stop working. Flushing tlb of the faulting
> address fixes the problem.
> 
> ---
> commit 978e2f36b1ae53e37ba27b3ab8f1c5ddbb8c8a10
> tree 7dd0e403c240162b1925db0834d694f4b4a0e95e
> parent ca02ea5aebcda886d1552c6af73ca96c02bf9fed
> author Pantelis Antoniou <panto@pantathon> Sun, 30 Oct 2005 21:53:48 +0200
> committer Pantelis Antoniou <panto@pantathon> Sun, 30 Oct 2005 21:53:48 +0200
> 
>  arch/ppc/mm/fault.c |   13 +++++++++++++
>  1 files changed, 13 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/ppc/mm/fault.c b/arch/ppc/mm/fault.c
> --- a/arch/ppc/mm/fault.c
> +++ b/arch/ppc/mm/fault.c
> @@ -240,6 +240,19 @@ good_area:
>  			goto bad_area;
>  		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
>  			goto bad_area;
> +
> +#ifdef CONFIG_8xx
> +		{
> +		/* 8xx is retarded; news at 11 */
> +		pte_t *ptep = NULL;
> +
> +		if (get_pteptr(mm, address, &ptep) && pte_present(*ptep))
> +			_tlbie(address);
> +
> +		if (ptep != NULL)
> +			pte_unmap(ptep);
> +		}
> +#endif
>  	}
>  
>  	/*

next prev parent reply	other threads:[~2005-11-07 13:45 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-30 20:03 [PATCH 2.6.14] mm: 8xx MM fix for Pantelis Antoniou
2005-10-30 21:16 ` Benjamin Herrenschmidt
2005-11-01 17:25 ` Marcelo Tosatti
2005-11-01 22:55   ` Pantelis Antoniou
2005-11-02  9:50     ` Marcelo Tosatti
2005-11-07  8:44 ` Marcelo Tosatti [this message]
2005-11-07 14:35   ` Dan Malek
2005-11-07 10:27     ` Marcelo Tosatti
2005-11-07 14:39   ` Pantelis Antoniou
2005-11-07 14:58   ` David Jander
2005-11-07 20:39   ` Benjamin Herrenschmidt
2005-11-07 17:02     ` Marcelo Tosatti
2005-11-07 20:50     ` Pantelis Antoniou
2005-11-08  0:44       ` Dan Malek
2005-11-09 12:04     ` Marcelo Tosatti
2005-11-10  7:48       ` David Jander
2005-11-10  8:18         ` David Jander
  -- strict thread matches above, loose matches on Subject: below --
2005-11-07 14:32 Joakim Tjernlund
2005-11-07 10:16 ` Marcelo Tosatti
2005-11-07 15:51   ` Tom Rini
2005-11-07 16:02 ` Dan Malek
2005-11-07 15:44 Joakim Tjernlund
2005-11-07 11:12 ` Marcelo Tosatti
2005-11-07 18:14 Joakim Tjernlund
2005-11-07 18:22 ` Tom Rini
2005-11-08  0:46   ` Dan Malek
2005-11-07 18:37 Joakim Tjernlund
2005-11-12 19:28 ` Marcelo Tosatti
2005-11-13 12:47   ` Joakim Tjernlund
2005-11-16  8:39     ` Marcelo Tosatti
2005-11-30 17:34 Joakim Tjernlund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20051107084431.GA15180@logos.cnet \
    --to=marcelo.tosatti@cyclades.com \
    --cc=dan@embeddedalley.com \
    --cc=linuxppc-embedded@ozlabs.org \
    --cc=pantelis.antoniou@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).