linux-arch.vger.kernel.org archive mirror
* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
       [not found]               ` <4538FDBC.6070301@yahoo.com.au>
@ 2006-10-20 17:16                 ` Linus Torvalds
  2006-10-20 17:37                   ` Nick Piggin
  0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:16 UTC (permalink / raw)
  To: Nick Piggin
  Cc: David Miller, ralf, Andrew Morton, Linux Kernel Mailing List,
	anemo, linux-arch, Martin Schwidefsky



On Sat, 21 Oct 2006, Nick Piggin wrote:
> > So maybe the COW D$ aliasing patch-series is just the right thing to do. Not
> > worry about D$ at _all_ when doing the actual fork, and only worry about it
> > on an actual COW event. Hmm?
> 
> Well if we have the calls in there, we should at least make them work
> right for the architectures there now. At the moment the flush_cache_mm
> before the copy_page_range wouldn't seem to do anything if you can still
> have threads dirty the cache again through existing TLB entries.
>
> I don't think that flushing on COW is exactly right though, because dirty
> data can remain invisible if you're only doing reads (no write, no flush).

You're right. A virtually indexed cache needs the flush _before_ we return 
from the fork into a new process (since otherwise the dirty data won't be 
visible in the new virtual address space).

So you've convinced me. Flushing at COW time _cannot_ be right, because it 
by definition means that there has been a time when the new process didn't 
see the dirty data in the case of a virtual index. And in the case of a 
physical index it cannot matter.

So I think the right thing to do is to forget about the COW D$ series 
(which probably _hides_ most of the problems in practice, so it "works" 
that way) and instead go with Ralf's last patch that just moves the 
flush_cache_mm() to after the TLB flush.

We do need to have all the architecture people (especially S390, which has 
been very strange in this regard in the past) check that it's ok. The 
_mappings_ are still valid, so S390 should be able to do the write-back, 
but there may be architectures that would want to do the flush _both_ 
before and after (for performance reasons - if writing out dirty data 
requires a TLB lookup, doing most of the writeback before is probably a
better thing, and then we can do a _second_ writeback after the flush to 
close the race with some other thread dirtying the pages before the TLB 
was marked read-only).

I added linux-arch and Martin Schwidefsky (s390) to the Cc:.

Guys, in case you missed the earlier discussion: there's a suggested patch 
by Ralf Baechle on linux-kernel (but it does just the "flush after" 
version, not the "perhaps we need it both before and after" thing I 
theorise about above). Message-ID: 20061020160538.GB18649@linux-mips.org.

		Linus

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 17:16                 ` [PATCH 1/3] Fix COW D-cache aliasing on fork Linus Torvalds
@ 2006-10-20 17:37                   ` Nick Piggin
  0 siblings, 0 replies; 26+ messages in thread
From: Nick Piggin @ 2006-10-20 17:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, ralf, Andrew Morton, Linux Kernel Mailing List,
	anemo, linux-arch, Martin Schwidefsky

Linus Torvalds wrote:
> 
> On Sat, 21 Oct 2006, Nick Piggin wrote:
> 
>>>So maybe the COW D$ aliasing patch-series is just the right thing to do. Not
>>>worry about D$ at _all_ when doing the actual fork, and only worry about it
>>>on an actual COW event. Hmm?
>>
>>Well if we have the calls in there, we should at least make them work
>>right for the architectures there now. At the moment the flush_cache_mm
>>before the copy_page_range wouldn't seem to do anything if you can still
>>have threads dirty the cache again through existing TLB entries.
>>
>>I don't think that flushing on COW is exactly right though, because dirty
>>data can remain invisible if you're only doing reads (no write, no flush).
> 
> 
> You're right. A virtually indexed cache needs the flush _before_ we return 
> from the fork into a new process (since otherwise the dirty data won't be 
> visible in the new virtual address space).
> 
> So you've convinced me. Flushing at COW time _cannot_ be right, because it 
> by definition means that there has been a time when the new process didn't 
> see the dirty data in the case of a virtual index. And in the case of a 
> physical index it cannot matter.
> 
> So I think the right thing to do is to forget about the COW D$ series 
> (which probably _hides_ most of the problems in practice, so it "works" 
> that way) and instead go with Ralf's last patch that just moves the 
> flush_cache_mm() to after the TLB flush.

So long as we don't move around the mmap semaphores, I'm OK with that
patch...

> We do need to have all the architecture people (especially S390, which has 
> been very strange in this regard in the past) check that it's ok. The 
> _mappings_ are still valid, so S390 should be able to do the write-back, 
> but there may be architectures that would want to do the flush _both_ 
> before and after (for performance reasons - if writing out dirty data 
> requires a TLB lookup, doing most of the writeback before is probably a
> better thing, and then we can do a _second_ writeback after the flush to 
> close the race with some other thread dirtying the pages before the TLB 
> was marked read-only).

Yes, that's my theory too. Probably the thing to aim for is replacing
that API with a new single call to flush caches and tlbs, and the
arch can do what best suits.

But for now, to get it actually *working*, moving the flush_cache_mm
seems like the first step.

> I added linux-arch and Martin Schwidefsky (s390) to the Cc:.
> 
> Guys, in case you missed the earlier discussion: there's a suggested patch 
> by Ralf Baechle on linux-kernel (but it does just the "flush after" 
> version, not the "perhaps we need it both before and after" thing I 
> theorise about above). Message-ID: 20061020160538.GB18649@linux-mips.org.

As I mentioned there, we probably want to check that other places
which flush caches before invalidating TLBs (e.g. most of the kernel) are
OK in the presence of concurrent writes through valid TLB entries from
other threads.

-- 
SUSE Labs, Novell Inc.

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
       [not found]               ` <20061020.125851.115909797.davem@davemloft.net>
@ 2006-10-20 20:10                 ` Linus Torvalds
  2006-10-20 20:59                   ` Russell King
                                     ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Linus Torvalds @ 2006-10-20 20:10 UTC (permalink / raw)
  To: David Miller
  Cc: nickpiggin, ralf, Andrew Morton, Linux Kernel Mailing List, anemo,
	linux-arch, Martin Schwidefsky



On Fri, 20 Oct 2006, David Miller wrote:
> 
> I did some more digging, here's what I think the hardware actually
> does:

Ok, this sounds sane.

What should we do about this? How does this patch look to people?

(Totally untested, and I'm not sure we should even do that whole 
"oldmm->mm_users" test, but I'm throwing it out here for discussion, in 
case it matters for performance. The second D$ flush should obviously be 
unnecessary for the common unthreaded case, which is why none of this has 
mattered historically, I think).

Comments? We need ARM, MIPS, sparc and S390 at the very least to sign off 
on this, and somebody to write a nice explanation for the changelog (and 
preferably do this through -mm too).

		Linus

---
diff --git a/kernel/fork.c b/kernel/fork.c
index 29ebb30..14c6a1d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -287,8 +287,18 @@ static inline int dup_mmap(struct mm_str
 	}
 	retval = 0;
 out:
-	up_write(&mm->mmap_sem);
 	flush_tlb_mm(oldmm);
+	/*
+	 * If we have other threads using the old mm, we need to
+	 * flush the D$ again - the other threads might have dirtied
+	 * it more before the TLB got flushed.
+	 *
+	 * After the flush, they can no longer dirty more pages,
+	 * since they are now marked read-only, of course.
+	 */
+	if (atomic_read(&oldmm->mm_users) != 1)
+		flush_cache_mm(oldmm);
+	up_write(&mm->mmap_sem);
 	up_write(&oldmm->mmap_sem);
 	return retval;
 fail_nomem_policy:

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 20:10                 ` Linus Torvalds
@ 2006-10-20 20:59                   ` Russell King
  2006-10-20 21:06                     ` David Miller
  2006-10-20 21:12                     ` Linus Torvalds
  2006-10-20 21:49                   ` Ralf Baechle
  2006-10-23  8:50                   ` Martin Schwidefsky
  2 siblings, 2 replies; 26+ messages in thread
From: Russell King @ 2006-10-20 20:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, ralf, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch, Martin Schwidefsky

On Fri, Oct 20, 2006 at 01:10:59PM -0700, Linus Torvalds wrote:
> On Fri, 20 Oct 2006, David Miller wrote:
> > I did some more digging, here's what I think the hardware actually
> > does:
> 
> Ok, this sounds sane.
> 
> What should we do about this? How does this patch look to people?
> 
> (Totally untested, and I'm not sure we should even do that whole 
> "oldmm->mm_users" test, but I'm throwing it out here for discussion, in 
> case it matters for performance. The second D$ flush should obviously be 
> unnecessary for the common unthreaded case, which is why none of this has 
> mattered historically, I think).
> 
> Comments? We need ARM, MIPS, sparc and S390 at the very least to sign off 
> on this, and somebody to write a nice explanation for the changelog (and 
> preferably do this through -mm too).

Well, looking at do_wp_page() I'm now quite concerned about ARM and COW.
I can't see how this code could _possibly_ work with a virtually indexed
cache as it stands.  Yet, the kernel does appear to work.

I'm afraid I'm utterly confused with the Linux MM in this day and age, so
I don't think I can even consider commenting on this change.

The majority of ARM caches are VIVT, so data read via the kernel mappings
definitely does not hit the same cache lines as data accessed via the user
mappings.

Our copy_user_page() function merely copies between the two kernel mappings
of the pages so with VIVT caches the kernel mappings - as it always has done
since its original invention.

However, when I look at this code now, I see _no where_ where we synchronise
the cache between the userspace mapping and the kernel space mapping before
copying a COW page.

So I'm afraid I'm going to have to hold up my hand and say "I don't
understand the Linux MM anymore".

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 20:59                   ` Russell King
@ 2006-10-20 21:06                     ` David Miller
  2006-10-20 21:17                       ` Russell King
  2006-10-20 21:12                     ` Linus Torvalds
  1 sibling, 1 reply; 26+ messages in thread
From: David Miller @ 2006-10-20 21:06 UTC (permalink / raw)
  To: rmk+lkml
  Cc: torvalds, nickpiggin, ralf, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky

From: Russell King <rmk+lkml@arm.linux.org.uk>
Date: Fri, 20 Oct 2006 21:59:29 +0100

> However, when I look at this code now, I see _no where_ where we synchronise
> the cache between the userspace mapping and the kernel space mapping before
> copying a COW page.

When the user obtains write access to the page, we'll flush.

Since there are many locations at which write access can be
obtained, there are many locations where the synchronization
is obtained.

One popular way to obtain the synchronization is to implement
flush_dcache_page() to flush, and implement clear_page() and
copy_user_page() to clear and copy pages in kernel space at
special temporary mappings whose virtual address will alias
up properly with userspace's mapping.  That's why we pass a
virtual address to these two arch functions.

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 20:59                   ` Russell King
  2006-10-20 21:06                     ` David Miller
@ 2006-10-20 21:12                     ` Linus Torvalds
  2006-10-20 21:28                       ` Russell King
  2006-10-20 21:41                       ` Ralf Baechle
  1 sibling, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2006-10-20 21:12 UTC (permalink / raw)
  To: Russell King
  Cc: David Miller, nickpiggin, ralf, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch, Martin Schwidefsky



On Fri, 20 Oct 2006, Russell King wrote:
> 
> Well, looking at do_wp_page() I'm now quite concerned about ARM and COW.
> I can't see how this code could _possibly_ work with a virtually indexed
> cache as it stands.  Yet, the kernel does appear to work.

It really shouldn't need any extra code, exactly because by the time it 
hits any page-fault, the caches had better be in sync with the physical 
page contents _anyway_ (yes, being virtual, the caches will _duplicate_ 
the contents, but since the pages are read-only, that aliasing should be 
perfectly fine).

> I'm afraid I'm utterly confused with the Linux MM in this day and age, so
> I don't think I can even consider commenting on this change.

Well, we'd need somebody to verify that it still works, but quite frankly, 
the likelihood of it breaking anything seems basically nil.

> However, when I look at this code now, I see _no where_ where we synchronise
> the cache between the userspace mapping and the kernel space mapping before
> copying a COW page.

At the COW, it should be synchronized already, exactly because we did the 
flush_cache_mm() when we _created_ the COW mapping in the first place.

It's just that we weren't quite careful enough at that time (and even 
then, that would only matter for some really really unlikely and strange 
situations that only happen when you fork() from a _threaded_ environment, 
so it shouldn't be anything you'd notice under normal load).

I think.

> So I'm afraid I'm going to have to hold up my hand and say "I don't
> understand the Linux MM anymore".

There are few enough people who understand it even though they're supposed 
to. I certainly have to always go back and look and read the code when 
there is anything subtle going on, and even then I want to be backed up by 
one of the _competent_ people ;)

			Linus

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 21:06                     ` David Miller
@ 2006-10-20 21:17                       ` Russell King
  2006-10-20 21:30                         ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Russell King @ 2006-10-20 21:17 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, nickpiggin, ralf, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky

On Fri, Oct 20, 2006 at 02:06:19PM -0700, David Miller wrote:
> From: Russell King <rmk+lkml@arm.linux.org.uk>
> Date: Fri, 20 Oct 2006 21:59:29 +0100
> 
> > However, when I look at this code now, I see _no where_ where we synchronise
> > the cache between the userspace mapping and the kernel space mapping before
> > copying a COW page.
> 
> When the user obtains write access to the page, we'll flush.
> 
> Since there are many locations at which write access can be
> obtained, there are many locations where the synchronization
> is obtained.
> 
> One popular way to obtain the synchronization is to implement
> flush_dcache_page() to flush, and implement clear_page() and
> copy_user_page() to clear and copy pages in kernel space at
> special temporary mappings whose virtual address will alias
> up properly with userspace's mapping.  That's why we pass a
> virtual address to these two arch functions.

I did say I had a VIVT cache.  With such a cache, the *only* place where
you can read data written via one mapping is via that very same mapping.
There is no other virtual address which will give you coherent access to
the data in another mapping.

The majority of ARMs to date have been VIVT, and the majority of Linux
kernels have worked fine (albeit with the "recent" breakage of PIO block IO).

I'm now in the situation where I come back to look at the MM code and,
to put it quite frankly, I can't see any possible way for ARM to work
with this code.  In practice, however, it does appear to work.  I just
can't see _why_ it's working.

Hence I'm raising the "I don't understand" flag and refraining from
endorsing the patch - I _can't_ endorse what I don't understand.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 21:12                     ` Linus Torvalds
@ 2006-10-20 21:28                       ` Russell King
  2006-10-20 21:41                       ` Ralf Baechle
  1 sibling, 0 replies; 26+ messages in thread
From: Russell King @ 2006-10-20 21:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, ralf, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch, Martin Schwidefsky

On Fri, Oct 20, 2006 at 02:12:11PM -0700, Linus Torvalds wrote:
> On Fri, 20 Oct 2006, Russell King wrote:
> > Well, looking at do_wp_page() I'm now quite concerned about ARM and COW.
> > I can't see how this code could _possibly_ work with a virtually indexed
> > cache as it stands.  Yet, the kernel does appear to work.
> 
> It really shouldn't need any extra code, exactly because by the time it 
> hits any page-fault, the caches had better be in sync with the physical 
> page contents _anyway_ (yes, being virtual, the caches will _duplicate_ 
> the contents, but since the pages are read-only, that aliasing should be 
> perfectly fine).

Oh, of course!  That explains why it actually works as expected!  Thanks
for filling back in that bit of swapped-out-years-ago-and-lost information.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 21:17                       ` Russell King
@ 2006-10-20 21:30                         ` David Miller
  0 siblings, 0 replies; 26+ messages in thread
From: David Miller @ 2006-10-20 21:30 UTC (permalink / raw)
  To: rmk+lkml
  Cc: torvalds, nickpiggin, ralf, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky

From: Russell King <rmk+lkml@arm.linux.org.uk>
Date: Fri, 20 Oct 2006 22:17:23 +0100

> I did say I had a VIVT cache.

And everything I said was premised with this understanding :-)

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 21:12                     ` Linus Torvalds
  2006-10-20 21:28                       ` Russell King
@ 2006-10-20 21:41                       ` Ralf Baechle
  2006-10-21 16:28                         ` Atsushi Nemoto
  1 sibling, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2006-10-20 21:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Russell King, David Miller, nickpiggin, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch, Martin Schwidefsky,
	James Bottomley

On Fri, Oct 20, 2006 at 02:12:11PM -0700, Linus Torvalds wrote:

> > Well, looking at do_wp_page() I'm now quite concerned about ARM and COW.
> > I can't see how this code could _possibly_ work with a virtually indexed
> > cache as it stands.  Yet, the kernel does appear to work.
> 
> It really shouldn't need any extra code, exactly because by the time it 
> hits any page-fault, the caches had better be in sync with the physical 
> page contents _anyway_ (yes, being virtual, the caches will _duplicate_ 
> the contents, but since the pages are read-only, that aliasing should be 
> perfectly fine).

Until yesterday I also thought multiple read-only copies wouldn't do any
harm.  Well, until I learned about the wonderful behaviour of the PA8800
caches.  PA8800 has VIPT primary caches, PIPT secondary caches.  And the
sinister part - caches are exclusive, that is a cacheline is either in
L1 or L2 but never in both and can migrate between L1 and L2.  Now
consider the following scenario:

 o physical address P is mapped to two aliasing addresses V1 and V2
 o a load from V1 results in a clean line in L1 caching P at index V1.
 o a store to V2 results in a clean line in L1 caching P at index V2.
 o the line at V2 is getting written back to memory.
 o a victim replacement of the line at V1 results in the _clean_ line
   migrating back from L1 to L2.

-> another read from V2 will return stale data.

As a consequence flush_cache_mm() on PA (or at least PA8800) currently blows
away the entire cache, as Kyle McMartin just told me.  The whole 1.5MB L1
and 32MB of L2 making fork an ultraheavy operation.

> It's just that we weren't quite careful enough at that time (and even 
> then, that would only matter for some really really unlikely and strange 
> situations that only happen when you fork() from a _threaded_ environment, 
> so it shouldn't be anything you'd notice under normal load).
> 
> I think.

The flush has been there for a very long time.  I have had it in my tree
since ~ 2.1.36 and I get the feeling nobody has ever seriously revisited
the issue since.

  Ralf

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 20:10                 ` Linus Torvalds
  2006-10-20 20:59                   ` Russell King
@ 2006-10-20 21:49                   ` Ralf Baechle
  2006-10-20 22:02                     ` Linus Torvalds
  2006-10-23  8:50                   ` Martin Schwidefsky
  2 siblings, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2006-10-20 21:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch, Martin Schwidefsky,
	James Bottomley

On Fri, Oct 20, 2006 at 01:10:59PM -0700, Linus Torvalds wrote:

> Ok, this sounds sane.
> 
> What should we do about this? How does this patch look to people?
> 
> (Totally untested, and I'm not sure we should even do that whole 
> "oldmm->mm_users" test, but I'm throwing it out here for discussion, in 
> case it matters for performance. The second D$ flush should obviously be 
> unnecessary for the common unthreaded case, which is why none of this has 
> mattered historically, I think).
> 
> Comments? We need ARM, MIPS, sparc and S390 at the very least to sign off 
> on this, and somebody to write a nice explanation for the changelog (and 
> preferably do this through -mm too).

As a minimal solution your patch would work for MIPS but performance would be
suboptimal.

With my D-cache alias series applied the flush_cache_mm() in dup_mmap()
becomes entirely redundant.  When I delete the call (not part of my patchset)
it means 12% faster fork.  But I'm not proposing this for 2.6.19.

Note this does not make the flush_cache_mm() on process termination
redundant ...

  Ralf

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 21:49                   ` Ralf Baechle
@ 2006-10-20 22:02                     ` Linus Torvalds
  2006-10-20 22:22                       ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2006-10-20 22:02 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: David Miller, nickpiggin, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch, Martin Schwidefsky,
	James Bottomley



On Fri, 20 Oct 2006, Ralf Baechle wrote:
> 
> As a minimal solution your patch would work for MIPS but performance would be
> suboptimal.

Not so.

> With my D-cache alias series applied the flush_cache_mm() in dup_mmap()
> becomes entirely redundant.

No it does not, as pointed out by Nick.

If there are dirty lines in the virtual cache, they _must_ be flushed long
before the COW happens. Because if they are not, they don't show up in the
child of the fork (which only sees its _own_ virtual cache). See?

> When I delete the call (not part of my patchset) it means 12% faster 
> fork.  But I'm not proposing this for 2.6.19.

I just suspect it means a _buggy_ fork.

It so happens (I think), that fork is big enough that it probably flushes 
the L1 cache _anyway_. 

Does MIPS have some kind of "flush_cache_mm()" that could only flush 
user-level caches? Maybe the overhead is from flushing all dirty 
cachelines, regardless of whether they are kernel or not (and dirty kernel 
cachelines are going to be the most common by far in that path).

		Linus

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 22:02                     ` Linus Torvalds
@ 2006-10-20 22:22                       ` David Miller
  2006-10-20 22:51                         ` Ralf Baechle
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2006-10-20 22:22 UTC (permalink / raw)
  To: torvalds
  Cc: ralf, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

From: Linus Torvalds <torvalds@osdl.org>
Date: Fri, 20 Oct 2006 15:02:39 -0700 (PDT)

> On Fri, 20 Oct 2006, Ralf Baechle wrote:
> > When I delete the call (not part of my patchset) it means 12% faster 
> > fork.  But I'm not proposing this for 2.6.19.
> 
> I just suspect it means a _buggy_ fork.
> 
> It so happens (I think), that fork is big enough that it probably flushes 
> the L1 cache _anyway_. 

My understanding is that this works because in Ralf's original patch
(which is the context in which he is removing the flush_cache_mm()
call), he uses kmap()/kunmap() to map the page(s) being accessed at a
kernel virtual address which will fall into the same cache color as
the user virtual address --> no alias problems.

Since he does this for every page touched on the kernel side during
dup_mmap(), the existing flush_cache_mm() call in dup_mmap() does in
fact become redundant.

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 22:22                       ` David Miller
@ 2006-10-20 22:51                         ` Ralf Baechle
  2006-10-20 23:28                           ` Linus Torvalds
  0 siblings, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2006-10-20 22:51 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

On Fri, Oct 20, 2006 at 03:22:47PM -0700, David Miller wrote:

> > On Fri, 20 Oct 2006, Ralf Baechle wrote:
> > > When I delete the call (not part of my patchset) it means 12% faster 
> > > fork.  But I'm not proposing this for 2.6.19.
> > 
> > I just suspect it means a _buggy_ fork.
> > 
> > It so happens (I think), that fork is big enough that it probably flushes 
> > the L1 cache _anyway_. 

I doubt it; I've tested this on 64K I-cache VIPT, 64K D-cache VIPT.

> My understanding is that this works because in Ralf's original patch
> (which is the context in which he is removing the flush_cache_mm()
> call), he uses kmap()/kunmap() to map the page(s) being accessed at a
> kernel virtual address which will fall into the same cache color as
> the user virtual address --> no alias problems.
>
> Since he does this for every page touched on the kernel side during
> dup_mmap(), the existing flush_cache_mm() call in dup_mmap() does in
> fact become redundant.

Correct.

It means no cache flush operation to deal with aliases at all left in
fork and COW code.

Another advantage of this strategy is that we will have to handle fewer
virtual coherency exceptions.  A virtual coherency exception is raised
on some MIPS processors when they detect the creation of a cache alias.
This allows the software to cleanup caches.  Neat as an alarm system for
alias debugging but rather expensive to service if large numbers are
raised, not available on all processors and also detects the creation of
harmless aliases of clean lines, thus a slight annoyance.

  Ralf

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 22:51                         ` Ralf Baechle
@ 2006-10-20 23:28                           ` Linus Torvalds
  2006-10-21  0:06                             ` Ralf Baechle
  0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2006-10-20 23:28 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: David Miller, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley



On Fri, 20 Oct 2006, Ralf Baechle wrote:
> 
> > My understanding is that this works because in Ralf's original patch
> > (which is the context in which he is removing the flush_cache_mm()
> > call), he uses kmap()/kunmap() to map the page(s) being accessed at a
> > kernel virtual address which will fall into the same cache color as
> > the user virtual address --> no alias problems.
> >
> > Since he does this for every page touched on the kernel side during
> > dup_mmap(), the existing flush_cache_mm() call in dup_mmap() does in
> > fact become redundant.
> 
> Correct.
> 
> It means no cache flush operation to deal with aliases at all left in
> fork and COW code.

Umm. That would seem to only happen to work for a direct-mapped virtually 
indexed cache where the index is taken purely from the virtual address, 
and there are no "process context" bits in the virtually indexed D$.

The moment there are process context bits involved, afaik you absolutely 
_need_ to flush, because otherwise the other process will never pick up 
the dirty state (which it would need to reload from memory).

That said, maybe nobody does that. Virtual caches are a total braindamage 
in the first place, so hopefully they have limited use.

		Linus

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 23:28                           ` Linus Torvalds
@ 2006-10-21  0:06                             ` Ralf Baechle
  2006-10-21  0:38                               ` Linus Torvalds
  0 siblings, 1 reply; 26+ messages in thread
From: Ralf Baechle @ 2006-10-21  0:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

On Fri, Oct 20, 2006 at 04:28:37PM -0700, Linus Torvalds wrote:

> > > My understanding is that this works because in Ralf's original patch
> > > (which is the context in which he is removing the flush_cache_mm()
> > > call), he uses kmap()/kunmap() to map the page(s) being accessed at a
> > > kernel virtual address which will fall into the same cache color as
> > > the user virtual address --> no alias problems.
> > >
> > > Since he does this for every page touched on the kernel side during
> > > dup_mmap(), the existing flush_cache_mm() call in dup_mmap() does in
> > > fact become redundant.
> > 
> > Correct.
> > 
> > It means no cache flush operation to deal with aliases at all left in
> > fork and COW code.
> 
> Umm. That would seem to only happen to work for a direct-mapped virtually 
> indexed cache where the index is taken purely from the virtual address, 
> and there are no "process context" bits in the virtually indexed D$.

No MIPS processor has something like that.  See below.

> The moment there are process context bits involved, afaik you absolutely 
> _need_ to flush, because otherwise the other process will never pick up 
> the dirty state (which it would need to reload from memory).

Correct.

> That said, maybe nobody does that. Virtual caches are a total braindamage 
> in the first place, so hopefully they have limited use.

On MIPS we never had pure virtual caches.  The major variants in existence
are:

 o D-cache PIPT, I-cache PIPT
 o PIVT (no typo!)
   Only the R6000 has this and it's not supported by Linux.
 o D-cache VIPT, I-cache VIPT
   This is by far the most common on any MIPS designed since '91.
   A variant of these caches has hardware logic to detect cache aliases and
   fix them automatically and therefore is equivalent to PIPT even though
   they are not implemented as PIPT.  And obviously the alias replay of the
   pipe will cost a few cycles.  The R10000 family of SGI belongs to this
   class, and the 24K/34K family of synthesizable cores by MIPS Technologies
   has this as a synthesis option.
   Another variant throws virtual coherency exceptions as I've explained in
   another thread.
 o D-cache PIPT, I-cache VIVT with additional address space tags.
 o Cacheless.  Not usually running Linux but heck, it's working anyway.

Be sure I'm sending CPU designers a strong message about aliases.  And I
think they're slowly getting the message that kernel hackers like to poke
needles into voodoo dolls for aliases ;-)

  Ralf

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  0:06                             ` Ralf Baechle
@ 2006-10-21  0:38                               ` Linus Torvalds
  2006-10-21  1:29                                 ` Paul Mackerras
                                                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Linus Torvalds @ 2006-10-21  0:38 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: David Miller, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley



On Sat, 21 Oct 2006, Ralf Baechle wrote:
>
> > That said, maybe nobody does that. Virtual caches are a total braindamage 
> > in the first place, so hopefully they have limited use.
> 
> On MIPS we never had pure virtual caches.

Ok, so on MIPS my scenario doesn't matter.

I think (but may be mistaken) that ARM _does_ have pure virtual caches 
with a process ID, but people have always ended up flushing them at 
context switch simply because it just causes too much trouble.

Sparc? VIPT too? Davem?

I have absolutely zero clue about s390.

Anyway, it sounds to me like this is too big to decide for 2.6.19 anyway, 
and as far as I can tell this is not a regression, right? I.e. we've always 
had the aliasing issue. Ralf?

But it would be good to have something for the early -rc1 sequence for 
2.6.20, and maybe the MIPS COW D$ patches are it, if it has performance 
advantages on MIPS that can also be translated to other virtual cache 
users..

> Be sure I'm sending CPU designers a strong message about aliases.

Castration. That's the best solution. We don't want those people 
procreating.

			Linus

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  0:38                               ` Linus Torvalds
@ 2006-10-21  1:29                                 ` Paul Mackerras
  2006-10-21  2:11                                 ` David Miller
  2006-12-02  9:49                                 ` Russell King
  2 siblings, 0 replies; 26+ messages in thread
From: Paul Mackerras @ 2006-10-21  1:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ralf Baechle, David Miller, nickpiggin, akpm, linux-kernel, anemo,
	linux-arch, schwidefsky, James.Bottomley

Linus Torvalds writes:

> I think (but may be mistaken) that ARM _does_ have pure virtual caches 
> with a process ID, but people have always ended up flushing them at 
> context switch simply because it just causes too much trouble.
> 
> Sparc? VIPT too? Davem?

There is one PowerPC embedded chip family, the PPC440, which has a
virtual i-cache with a process ID tag.  The d-cache is sane though.
Of course, the i-cache being readonly means we avoid the nastier
issues.

Paul.

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  0:38                               ` Linus Torvalds
  2006-10-21  1:29                                 ` Paul Mackerras
@ 2006-10-21  2:11                                 ` David Miller
  2006-10-21  2:37                                   ` Linus Torvalds
  2006-12-02  9:49                                 ` Russell King
  2 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2006-10-21  2:11 UTC (permalink / raw)
  To: torvalds
  Cc: ralf, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

From: Linus Torvalds <torvalds@osdl.org>
Date: Fri, 20 Oct 2006 17:38:32 -0700 (PDT)

> I think (but may be mistaken) that ARM _does_ have pure virtual caches 
> with a process ID, but people have always ended up flushing them at 
> context switch simply because it just causes too much trouble.
> 
> Sparc? VIPT too? Davem?

sun4c is VIVT, but has no SMP variants.
sun4m has both VIPT and PIPT.

> But it would be good to have something for the early -rc1 sequence for 
> 2.6.20, and maybe the MIPS COW D$ patches are it, if it has performance 
> advantages on MIPS that can also be translated to other virtual cache 
> users..

I think it could help for sun4m highmem configs.

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  2:11                                 ` David Miller
@ 2006-10-21  2:37                                   ` Linus Torvalds
  2006-10-21  2:46                                     ` David Miller
                                                       ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Linus Torvalds @ 2006-10-21  2:37 UTC (permalink / raw)
  To: David Miller
  Cc: ralf, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley



On Fri, 20 Oct 2006, David Miller wrote:
>
> From: Linus Torvalds <torvalds@osdl.org>
> Date: Fri, 20 Oct 2006 17:38:32 -0700 (PDT)
> 
> > I think (but may be mistaken) that ARM _does_ have pure virtual caches 
> > with a process ID, but people have always ended up flushing them at 
> > context switch simply because it just causes too much trouble.
> > 
> > Sparc? VIPT too? Davem?
> 
> sun4c is VIVT, but has no SMP variants.

You don't need SMP - we have sleeping sections here, so even threads on UP 
can trigger it. 

Now, to trigger it you need to have
 - virtual indexing not just by  address, but by some "address space 
   identifier" thing too
 - (in practice) a big enough cache that switching tasks wouldn't flush it 
   anyway.

> sun4m has both VIPT and PIPT.
> 
> > But it would be good to have something for the early -rc1 sequence for 
> > 2.6.20, and maybe the MIPS COW D$ patches are it, if it has performance 
> > advantages on MIPS that can also be translated to other virtual cache 
> > users..
> 
> I think it could help for sun4m highmem configs.

Well, if you can re-create the performance numbers (Ralf - can you send 
the full series with the final "remove the now unnecessary flush" to 
Davem?), that will make deciding things easier, I think.

I suspect sparc, mips and arm are the main architectures where virtually 
indexed caching really matters enough for this to be an issue at all.

		Linus

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  2:37                                   ` Linus Torvalds
@ 2006-10-21  2:46                                     ` David Miller
  2006-10-21 18:27                                     ` Ralf Baechle
  2006-10-22  1:34                                     ` Ralf Baechle
  2 siblings, 0 replies; 26+ messages in thread
From: David Miller @ 2006-10-21  2:46 UTC (permalink / raw)
  To: torvalds
  Cc: ralf, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

From: Linus Torvalds <torvalds@osdl.org>
Date: Fri, 20 Oct 2006 19:37:24 -0700 (PDT)

> On Fri, 20 Oct 2006, David Miller wrote:
> > I think it could help for sun4m highmem configs.
> 
> Well, if you can re-create the performance numbers (Ralf - can you send 
> the full series with the final "remove the now unnecessary flush" to 
> Davem?), that will make deciding things easier, I think.
> 
> I suspect sparc, mips and arm are the main architectures where virtually 
> indexed caching really matters enough for this to be an issue at all.

Unfortunately, I don't have any sparc 32-bit systems any more,
so I can't really help out here.  I just make sure the build
keeps working :-)

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 21:41                       ` Ralf Baechle
@ 2006-10-21 16:28                         ` Atsushi Nemoto
  0 siblings, 0 replies; 26+ messages in thread
From: Atsushi Nemoto @ 2006-10-21 16:28 UTC (permalink / raw)
  To: ralf
  Cc: torvalds, rmk+lkml, davem, nickpiggin, akpm, linux-kernel,
	linux-arch, schwidefsky, James.Bottomley

On Fri, 20 Oct 2006 22:41:22 +0100, Ralf Baechle <ralf@linux-mips.org> wrote:
> > It's just that we weren't quite careful enough at that time (and even 
> > then, that would only matter for some really really unlikely and strange 
> > situations that only happen when you fork() from a _threaded_ environment, 
> > so it shouldn't be anything you'd notice under normal load).
> > 
> > I think.
> 
> The flush has been there for a very long time.  I have had it in my tree
> since ~ 2.1.36 and I get the feeling nobody has seriously revisited the
> issue since.

I think calling fork() (or system() or popen() or so) in a threaded
program is neither very unlikely nor strange.  But this breakage happens
very rarely indeed, especially on a non-preemptive kernel.

While debugging this issue, I used the test program below and a slightly
modified kernel, with a yield() inserted in the middle of dup_mmap().

With the modified kernel on a 32KB VIPT D$, running this test program
could sometimes reproduce the breakage ("BAD!" messages).  I heard that
the PARISC people succeeded in reproducing it too.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <sched.h>	/* sched_yield() */
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static void *thread_func(void *arg)
{
	unsigned char buf[2048], j;
	size_t i;	/* unsigned, to match sizeof() in the loop below */
	for (j = 0; ; j++) {
		/* fill buf[] with j, yielding in between so others can run */
		memset(buf, j, sizeof(buf)/2);
		sched_yield();
		memset(buf + sizeof(buf)/2, j, sizeof(buf)/2);
		sched_yield();
		/* check buf[] contents: every byte must still equal j */
		for (i = 0; i < sizeof(buf); i++) {
			if (buf[i] != j) {
				printf("BAD! %p (%d != %d)\n",
				       (void *)(buf + i), buf[i], j);
				exit(1);
			}
		}
	}
}

int main(int argc, char *argv[])
{
	int i;
	pid_t pid;
	pthread_t tid;
	for (i = 0; i < 4; i++)
		pthread_create(&tid, NULL, thread_func, NULL);
	for (i = 0; i < 100; i++) {
		pid = fork();
		if (pid == -1) {
			perror("fork");
			exit(1);
		}
		if (pid)
			waitpid(pid, NULL, 0);
		else
			exit(0);
	}
	return 0;
}

---
Atsushi Nemoto

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  2:37                                   ` Linus Torvalds
  2006-10-21  2:46                                     ` David Miller
@ 2006-10-21 18:27                                     ` Ralf Baechle
  2006-10-22  1:34                                     ` Ralf Baechle
  2 siblings, 0 replies; 26+ messages in thread
From: Ralf Baechle @ 2006-10-21 18:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

On Fri, Oct 20, 2006 at 07:37:24PM -0700, Linus Torvalds wrote:

> Well, if you can re-create the performance numbers (Ralf - can you send 
> the full series with the final "remove the now unnecessary flush" to 
> Davem?), that will make deciding things easier, I think.
> 
> I suspect sparc, mips and arm are the main architectures where virtually 
> indexed caching really matters enough for this to be an issue at all.

What I was using for my fork benchmark was basically the series as posted
in this thread + the quick hack patch below.

I'll dig up some numbers for the posted patchset and will send them later.

   Ralf

diff --git a/kernel/fork.c b/kernel/fork.c
index 29ebb30..c83d226 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -202,7 +202,6 @@ static inline int dup_mmap(struct mm_str
 	struct mempolicy *pol;
 
 	down_write(&oldmm->mmap_sem);
-	flush_cache_mm(oldmm);
 	/*
 	 * Not linked in yet - no deadlock potential:
 	 */

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  2:37                                   ` Linus Torvalds
  2006-10-21  2:46                                     ` David Miller
  2006-10-21 18:27                                     ` Ralf Baechle
@ 2006-10-22  1:34                                     ` Ralf Baechle
  2 siblings, 0 replies; 26+ messages in thread
From: Ralf Baechle @ 2006-10-22  1:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, akpm, linux-kernel, anemo, linux-arch,
	schwidefsky, James.Bottomley

On Fri, Oct 20, 2006 at 07:37:24PM -0700, Linus Torvalds wrote:

> Well, if you can re-create the performance numbers (Ralf - can you send 
> the full series with the final "remove the now unnecessary flush" to 
> Davem?), that will make deciding things easier, I think.

Below are numbers and comments from Atsushi Nemoto on two Toshiba TX49
cores with 16K and 32K per primary cache, respectively.  Each lmbench run
was repeated twice.  The numbers were taken without the flush_cache_mm hack
to dup_mmap(), so there are those 12% on fork which can easily be obtained
in addition on VIPT caches such as the TX49.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
TX49-16K  without-patch  197 0.72 2.40 17.7 34.3 82.9 2.84 26.2 2500 9364 39.K
TX49-16K  without-patch  197 0.73 2.40 17.7 34.4 82.9 2.85 26.1 2495 9337 39.K
TX49-16K  without-patch  197 0.72 2.40 17.8 34.3 82.9 2.85 26.1 2501 9341 39.K
TX49-16K  with-patch     197 0.72 2.39 20.1 31.9 82.9 2.85 20.2 2491 9101 38.K
TX49-16K  with-patch     197 0.72 2.39 20.1 32.8 82.9 2.86 20.2 2496 9058 38.K
TX49-16K  with-patch     197 0.72 2.39 20.1 32.8 82.9 2.85 20.3 2501 9074 38.K
TX49-32K  without-patch  396 0.36 1.19 6.78 11.3 41.0 1.41 8.15 1246 4674 19.K
TX49-32K  without-patch  396 0.36 1.19 6.78 11.3 41.0 1.41 8.17 1251 4680 19.K
TX49-32K  without-patch  396 0.36 1.19 6.79 11.3 41.0 1.41 8.15 1250 4682 19.K
TX49-32K  with-patch     396 0.36 1.19 6.79 10.2 41.0 1.41 8.14 1230 4638 19.K
TX49-32K  with-patch     396 0.36 1.19 6.78 10.2 40.9 1.41 8.14 1241 4628 19.K
TX49-32K  with-patch     396 0.36 1.19 6.79 10.2 40.9 1.41 8.14 1238 4627 19.K

A little bit faster on exec/proc and open/close (why?).  Strange
results on sig/hndl and stat on TX49-16K again.


Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
TX49-16K  without-patch 4.7800   87.1   36.5   97.2   47.8   101.1    40.8
TX49-16K  without-patch 5.4000   88.4   28.9   96.2   39.6   101.2    40.8
TX49-16K  without-patch 4.6800   84.5   32.7   96.8   46.5   100.2    42.9
TX49-16K  with-patch    2.4600   82.7   34.0   93.5   50.5    97.2    43.3
TX49-16K  with-patch    1.5200   87.7   33.6   95.1   42.4    99.7    43.6
TX49-16K  with-patch    1.7700   86.2   34.1   95.8   49.0    99.2    41.7
TX49-32K  without-patch          31.4   11.3   72.1   15.2    72.3    16.9
TX49-32K  without-patch          30.4   11.6   72.2   16.1    73.2    15.1
TX49-32K  without-patch          33.5   12.1   71.2   15.5    73.0    17.0
TX49-32K  with-patch             30.9   11.5   72.3   17.4    73.1    17.5
TX49-32K  with-patch             31.5   11.9   71.8   15.8    73.0    16.7
TX49-32K  with-patch             32.5   10.4   71.7   16.0    72.5    16.6

No noticeable differences.


File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
TX49-16K  without-patch  251.8  192.5 1212.1  397.6   500.0 4.388 7.32710  50.5
TX49-16K  without-patch  254.7  193.9 1197.6  394.9   505.0 4.412 7.34230  50.5
TX49-16K  without-patch  252.6  193.6 1212.1  399.4   499.0 4.758 7.33790  50.5
TX49-16K  with-patch     251.8  192.2 1207.7  391.7   502.0 0.143 7.32320  50.5
TX49-16K  with-patch     252.7  194.0 1200.5  393.7   505.0 0.108 7.32030  50.5
TX49-16K  with-patch     252.0  191.8 1199.0  392.3   508.0 0.011 7.33150  50.5
TX49-32K  without-patch   86.0   54.8  461.3  146.3   378.0 1.818 5.45460  25.0
TX49-32K  without-patch   86.5   54.1  454.3  148.1   378.0 1.816 5.47120  25.0
TX49-32K  without-patch   86.7   53.8  458.1  148.0   378.0 2.130 5.48540  25.0
TX49-32K  with-patch      90.4   52.5  460.8  148.7   377.0 0.471 5.46340  25.0
TX49-32K  with-patch      88.8   52.6  462.5  148.6   380.0 0.476 5.44630  25.0
TX49-32K  with-patch      88.7   52.9  466.4  147.8   378.0 0.477 5.49560  25.0

Major improvements on Prot/Fault.

  Ralf

* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-20 20:10                 ` Linus Torvalds
  2006-10-20 20:59                   ` Russell King
  2006-10-20 21:49                   ` Ralf Baechle
@ 2006-10-23  8:50                   ` Martin Schwidefsky
  2 siblings, 0 replies; 26+ messages in thread
From: Martin Schwidefsky @ 2006-10-23  8:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, nickpiggin, ralf, Andrew Morton,
	Linux Kernel Mailing List, anemo, linux-arch

On Fri, 2006-10-20 at 13:10 -0700, Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, David Miller wrote:
> > 
> > I did some more digging, here's what I think the hardware actually
> > does:
> 
> Ok, this sounds sane.
> 
> What should we do about this? How does this patch look to people?
> 
> (Totally untested, and I'm not sure we should even do that whole 
> "oldmm->mm_users" test, but I'm throwing it out here for discussion, in 
> case it matters for performance. The second D$ flush should obviously be 
> unnecessary for the common unthreaded case, which is why none of this has 
> mattered historically, I think).
> 
> Comments? We need ARM, MIPS, sparc and S390 at the very least to sign off 
> on this, and somebody to write a nice explanation for the changelog (and 
> preferably do this through -mm too).

On s390 you never have to worry about cache flushing. It is not stated
anywhere in the principles of operation but the architecture has to be
PIPT, otherwise it couldn't possibly work. The best indication for it is
that there is no cache flush instruction. The view on data in memory is
always consistent.

-- 
blue skies,
  Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.



* Re: [PATCH 1/3] Fix COW D-cache aliasing on fork
  2006-10-21  0:38                               ` Linus Torvalds
  2006-10-21  1:29                                 ` Paul Mackerras
  2006-10-21  2:11                                 ` David Miller
@ 2006-12-02  9:49                                 ` Russell King
  2 siblings, 0 replies; 26+ messages in thread
From: Russell King @ 2006-12-02  9:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ralf Baechle, David Miller, nickpiggin, akpm, linux-kernel, anemo,
	linux-arch, schwidefsky, James.Bottomley

On Fri, Oct 20, 2006 at 05:38:32PM -0700, Linus Torvalds wrote:
> On Sat, 21 Oct 2006, Ralf Baechle wrote:
> > > That said, maybe nobody does that. Virtual caches are a total braindamage 
> > > in the first place, so hopefully they have limited use.
> > 
> > On MIPS we never had pure virtual caches.
> 
> Ok, so on MIPS my scenario doesn't matter.
> 
> I think (but may be mistaken) that ARM _does_ have pure virtual caches 
> with a process ID, but people have always ended up flushing them at 
> context switch simply because it just causes too much trouble.

Just read this, sorry.

The majority of ARM CPU implementations have pure virtual caches
_without_ process IDs.  (Some have a nasty hack which involves
remapping the lower 32MB of virtual memory space to other areas
of the cache's virtual space, but obviously that limits you to
32MB of VM.)

Thankfully, with ARM version 6, they had an inkling of clue, and
decided to move to VIPT caches but with _optional_ aliasing, and
if the CPU design was Harvard there's a possibility for D/I cache
aliasing.

> > Be sure I'm sending CPU designers a strong message about aliases.
> 
> Castration. That's the best solution. We don't want those people 
> procreating.

Absolutely.  Can we start such a program in Cambridge, England ASAP
please?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
