public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] flush_cache_page while pte valid
@ 2002-11-11 21:20 Manfred Spraul
  2002-11-12  0:37 ` Russell King
  0 siblings, 1 reply; 16+ messages in thread
From: Manfred Spraul @ 2002-11-11 21:20 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton; +Cc: linux-kernel

> 	/* Nuke the page table entry. */
>+	flush_cache_page(vma, address);
> 	pte = ptep_get_and_clear(ptep);
> 	flush_tlb_page(vma, address);
>-	flush_cache_page(vma, address);
> 

Is it correct that this are 3 arch hooks that must appear back to back?
What about one hook with all parameters?

	pte = ptep_get_and_clear_and_flush(ptep, vma, address);

The current implementation just asks for such errors.

Documentation:
ptep_get_and_clear_and_flush() removes the page table entry *ptep from the page tables and clears all caches (tlb, cpu cache if virtually tagged).
The return value contains the final value of the pte. The critical information is the dirty bit in the pte - on SMP one cpu can call ptep_get_and_clear_and_flush() while another cpu accesses the mmap'ing from user space.


--
	Manfred



^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH] flush_cache_page while pte valid
@ 2002-11-12 21:32 Ulrich Weigand
  0 siblings, 0 replies; 16+ messages in thread
From: Ulrich Weigand @ 2002-11-12 21:32 UTC (permalink / raw)
  To: manfred; +Cc: linux-kernel, uweigand

Manfred Spraul wrote:

>The assumption for the dirty bit is that the i386 cpu are the only cpus 
>in the world (TM) that maintain the dirty bit in hardware

Not true; the s390 also has a dirty bit in hardware ;-)

To make things more interesting, the s390 dirty bit does not
reside in the pte, but in a storage key associated with the
*physical* page.  The bit is set on any access to the physical
page, whether through any virtual mapping, by direct access
with dynamic address translation switched off, or by an I/O
device moving data directly to main memory.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  weigand@informatik.uni-erlangen.de

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH] flush_cache_page while pte valid
@ 2002-11-12 20:53 Manfred Spraul
  2002-11-12 22:01 ` David S. Miller
  0 siblings, 1 reply; 16+ messages in thread
From: Manfred Spraul @ 2002-11-12 20:53 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel, davem

>
>
>>
>> The flush merely writes back the data, a copy-back operation, fully L2
>> cache coherent.  All cpus will see correct data if an intermittant
>> store occurs.
>
>The CPUs will, but the harddisk might not.
>  
>
The lost dirty bit can only happen on cpus where the hardware maintains 
a dirty bit. I doubt that the sparc cpus do that in hardware.

But like Hugh I don't understand how the cache writeback works on SMP.

cpu1:                cpu 2
                        access a mmaping, pte loaded into TLB
try_to_unmap_one()
flush_cache_page();
                        access the mmaping again. pte either still from
                        tlb, or reloaded from pte.
ptep_get_and_clear();
                        access the mmaping again, using the tlb
flush_tlb()
                        ??? How will the cpu write back now?

If the write back happens based on the tlb, then I don't understand why 
it's needed at all.

Regarding the dirty bit:
The assumption for the dirty bit is that the i386 cpu are the only cpus 
in the world (TM) that maintain the dirty bit in hardware, and tests on 
several i386 cpus have shown that the tlb walker retests the present bit 
before setting the dirty bit. Software tlb implementations must emulate 
that.

Thus it's guaranteed that
- if the dirty bit is not set in the result of ptep_get_and_clear, then 
no write operation has happened or will happen.
- if the dirty bit is set, then write operations could happen until the 
tlb flush.
- there will be no spuriously set dirty bits in the page tables.

--
    Manfred


^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH] flush_cache_page while pte valid
@ 2002-11-12  1:08 Ulrich Weigand
  0 siblings, 0 replies; 16+ messages in thread
From: Ulrich Weigand @ 2002-11-12  1:08 UTC (permalink / raw)
  To: manfred; +Cc: linux-kernel, uweigand, schwidefsky

Manfred Spraul wrote:

>Is it correct that this are 3 arch hooks that must appear back to back?
>What about one hook with all parameters?
>
>        pte = ptep_get_and_clear_and_flush(ptep, vma, address);

This interface would help us on s390 very much.  We have a hardware
instruction (INVALIDATE PAGE TABLE ENTRY) that implements the combination 
of 
        pte = ptep_get_and_clear(ptep);
        flush_tlb_page(vma, address);

very efficiently, but we cannot use it to implement flush_tlb_page
alone, because it requires the pte to be valid.

This is why our current in-tree flush_tlb_page is quite inefficient
(it always flushes the complete TLB) ...

If we had a combined hook, we could use our IPTE instruction.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  weigand@informatik.uni-erlangen.de

^ permalink raw reply	[flat|nested] 16+ messages in thread
* [PATCH] flush_cache_page while pte valid
@ 2002-11-11 18:25 Hugh Dickins
  2002-11-11 18:35 ` Andrew Morton
  2002-11-11 23:19 ` David S. Miller
  0 siblings, 2 replies; 16+ messages in thread
From: Hugh Dickins @ 2002-11-11 18:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dave McCracken, Rik van Riel, David S. Miller, linux-kernel

On some architectures (cachetlb.txt gives HyperSparc as an example)
it is essential to flush_cache_page while pte is still valid: the
rmap VM diverged from the base 2.4 VM before that fix was made,
so this error has crept back into 2.5.

Patch below applies to 2.5.47 or 2.5.47-mm1 - needs more work over
shared pagetables, but they've silently fallen out of 2.5.47-mm1:
oversight?  I'll send Alan the equivalent for 2.4-ac shortly.

(I wonder, what happens if userspace now modifies the page
after the flush_cache_page, before the pte is invalidated?)

--- 2.5.47/mm/rmap.c	Mon Oct  7 20:37:50 2002
+++ linux/mm/rmap.c	Mon Nov 11 17:01:27 2002
@@ -393,9 +393,9 @@
 	}
 
 	/* Nuke the page table entry. */
+	flush_cache_page(vma, address);
 	pte = ptep_get_and_clear(ptep);
 	flush_tlb_page(vma, address);
-	flush_cache_page(vma, address);
 
 	/* Store the swap location in the pte. See handle_pte_fault() ... */
 	if (PageSwapCache(page)) {


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2002-11-12 22:52 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-11-11 21:20 [PATCH] flush_cache_page while pte valid Manfred Spraul
2002-11-12  0:37 ` Russell King
  -- strict thread matches above, loose matches on Subject: below --
2002-11-12 21:32 Ulrich Weigand
2002-11-12 20:53 Manfred Spraul
2002-11-12 22:01 ` David S. Miller
2002-11-12  1:08 Ulrich Weigand
2002-11-11 18:25 Hugh Dickins
2002-11-11 18:35 ` Andrew Morton
2002-11-11 23:19 ` David S. Miller
2002-11-12  6:53   ` Hugh Dickins
2002-11-12  6:53     ` David S. Miller
2002-11-12 14:49       ` Rik van Riel
2002-11-12 21:45         ` David S. Miller
2002-11-12 17:43       ` Hugh Dickins
2002-11-12 21:51         ` David S. Miller
2002-11-12 22:59           ` Hugh Dickins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox