Linux PARISC architecture development
 help / color / mirror / Atom feed
* [parisc-linux] Re: RFC: mmap patch
@ 2003-03-06 14:14 John Marvin
  2003-03-06 14:31 ` Matthew Wilcox
  2003-03-08  6:30 ` Grant Grundler
  0 siblings, 2 replies; 27+ messages in thread
From: John Marvin @ 2003-03-06 14:14 UTC (permalink / raw)
  To: parisc-linux; +Cc: davem

>
> Looking for some vm gurus to verify that this is correct. This fixes a
> problem where the kernel-view of a page vs the user-view can get out of
> sync because of the virtually tagged cache of PA. This fixes a bug
> reported some time back where write(2)s to a mmaped fd doesn't get
> updated properly in the mmap'ed region.
>
> Many thanks to willy for his advice in putting this together :)
>
> This is tested on 2.5.64-pa1 on a500, but should also apply to 2.4.
>
> Any comments before I apply?
>

In my opinion this patch is a hack workaround for a real bug. parisc is
not the only architecture that has virtual tagged caches.  Some mips
machines, sparc and ultrasparc machines have virtual tagged caches,
although none of them have virtual tagged caches as large as parisc (the
other architectures typically have a larger physical cache at a higher
level in the cache hierarchy).

Dave Miller designed the cache flushing strategy to have hooks in the
machine independent code to support virtual tagged caches.  Probably there
is simply a cache flush that is missing that doesn't show itself as a bug
as easily on the smaller virtually tagged caches of the other
architectures. At most this particular scenario wasn't considered
for virtual tagged caches (maintaining coherence between fd writes and
mmap'ed regions) and will require a design change to fix.

The fix for this bug should be made in machine independent code, not in
our machine dependent code.

John Marvin
jsm@fc.hp.com

P.S. I am cc'ing Dave Miller to see if he cares to comment.

> Index: include/asm-parisc/cacheflush.h
> ===================================================================
> RCS file: /var/cvs/linux-2.5/include/asm-parisc/cacheflush.h,v
> retrieving revision 1.5
> diff -u -p -r1.5 cacheflush.h
> --- include/asm-parisc/cacheflush.h     6 Mar 2003 04:20:45 -0000       1.5
> +++ include/asm-parisc/cacheflush.h     6 Mar 2003 06:35:41 -0000
> @@ -67,13 +67,15 @@ flush_user_icache_range(unsigned long st
>  #endif
>  }
>
> +extern void __flush_dcache_page(struct page *page);
> +
>  static inline void flush_dcache_page(struct page *page)
>  {
>         if (page->mapping && list_empty(&page->mapping->i_mmap) &&
>                         list_empty(&page->mapping->i_mmap_shared)) {
>                 set_bit(PG_dcache_dirty, &page->flags);
>         } else {
> -               flush_kernel_dcache_page(page_address(page));
> +               __flush_dcache_page(page);
>         }
>  }
>
> Index: arch/parisc/kernel/cache.c
> ===================================================================
> RCS file: /var/cvs/linux-2.5/arch/parisc/kernel/cache.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 cache.c
> --- arch/parisc/kernel/cache.c  5 Mar 2003 21:15:37 -0000       1.9
> +++ arch/parisc/kernel/cache.c  6 Mar 2003 06:35:41 -0000
> @@ -222,3 +222,42 @@ void disable_sr_hashing(void)
>
>         disable_sr_hashing_asm(srhash_type);
>  }
> +
> +void __flush_dcache_page(struct page *page)
> +{
> +       struct mm_struct *mm = current->active_mm;
> +       struct list_head *l;
> +
> +       flush_kernel_dcache_page(page_address(page));
> +
> +       if (!page->mapping)
> +               return;
> +
> +       list_for_each(l, &page->mapping->i_mmap_shared) {
> +               struct vm_area_struct *mpnt;
> +               unsigned long off;
> +
> +               mpnt = list_entry(l, struct vm_area_struct, shared);
> +
> +               /*
> +                * If this VMA is not in our MM, we can ignore it.
> +                */
> +               if (mpnt->vm_mm != mm)
> +                       continue;
> +
> +               if (page->index < mpnt->vm_pgoff)
> +                       continue;
> +
> +               off = page->index - mpnt->vm_pgoff;
> +               if (off >= (mpnt->vm_end - mpnt->vm_start) >> PAGE_SHIFT)
> +                       continue;
> +
> +               flush_cache_page(mpnt, mpnt->vm_start + (off << PAGE_SHIFT));
> +
> +               /* All user shared mappings should be equivalently mapped,
> +                * so once we've flushed one we should be ok
> +                */
> +               break;
> +       }
> +}
> +
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-06 14:14 [parisc-linux] Re: RFC: mmap patch John Marvin
@ 2003-03-06 14:31 ` Matthew Wilcox
  2003-03-06 15:31   ` Randolph Chung
  2003-03-08  6:30 ` Grant Grundler
  1 sibling, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2003-03-06 14:31 UTC (permalink / raw)
  To: John Marvin; +Cc: parisc-linux, davem

On Thu, Mar 06, 2003 at 07:14:53AM -0700, John Marvin wrote:
> In my opinion this patch is a hack workaround for a real bug. parisc is
> not the only architecture that has virtual tagged caches.  Some mips
> machines, sparc and ultrasparc machines have virtual tagged caches,
> although none of them have virtual tagged caches as large as parisc (the
> other architectures typically have a larger physical cache at a higher
> level in the cache hierarchy).
> 
> Dave Miller designed the cache flushing strategy to have hooks in the
> machine independent code to support virtual tagged caches.  Probably there
> is simply a cache flush that is missing that doesn't show itself as a bug
> as easily on the smaller virtually tagged caches of the other
> architectures. At most this particular scenario wasn't considered
> for virtual tagged caches (maintaining coherence between fd writes and
> mmap'ed regions) and will require a design change to fix.
> 
> The fix for this bug should be made in machine independent code, not in
> our machine dependent code.

Unfortunately, the flush is in the right place according to the
definition.  Here's how it looks in 2.4 (2.5 is more complex but has
essentially the same flush in it):

generic_file_write(struct file *file,const char *buf,size_t count, loff_t *ppos)
{
[...]
                kaddr = kmap(page);
                status = mapping->a_ops->prepare_write(file, page, offset, offse
t+bytes);
                if (status)
                        goto sync_failure;
                page_fault = __copy_from_user(kaddr+offset, buf, bytes);
                flush_dcache_page(page);
                status = mapping->a_ops->commit_write(file, page, offset, offset
+bytes);

So our flush_dcache_page() flushes the kernel's view of that page,
no problem.  Memory now has the right contents.  But the dcache still
has the stale data in it for the user's mapping of the same page.

There's a few ways to fix it.

1) This patch tausq did.  Now we flush both user & kernel mappings for the
data.

2) Split flush_dcache_page into flush_dcache_page_user_mapping() and
flush_dcache_page_kernel_mapping().  Hopefully with better names.

3) (ab)use kmap to hand back an address which virtually aliases the user
mapping.  flush_dcache_page() would only have to writeback to memory if
the page was not mapped.

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-06 14:31 ` Matthew Wilcox
@ 2003-03-06 15:31   ` Randolph Chung
  0 siblings, 0 replies; 27+ messages in thread
From: Randolph Chung @ 2003-03-06 15:31 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: John Marvin, parisc-linux, davem

In reference to a message from Matthew Wilcox, dated Mar 06:
> > In my opinion this patch is a hack workaround for a real bug. parisc is
> > not the only architecture that has virtual tagged caches.  Some mips
> > machines, sparc and ultrasparc machines have virtual tagged caches,
> > although none of them have virtual tagged caches as large as parisc (the
> > other architectures typically have a larger physical cache at a higher
[...]
> 1) This patch tausq did.  Now we flush both user & kernel mappings for the
> data.

I should mention that this is almost exactly what arm does as well.

randolph
-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08  6:30 ` Grant Grundler
@ 2003-03-08  6:29   ` David S. Miller
  2003-03-08 17:24     ` Grant Grundler
  2003-03-08 23:11     ` Matthew Wilcox
  0 siblings, 2 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-08  6:29 UTC (permalink / raw)
  To: grundler; +Cc: jsm, parisc-linux

   From: grundler@dsl2.external.hp.com (Grant Grundler)
   Date: Fri, 7 Mar 2003 23:30:43 -0700
   
   I don't pretend to understand the issues with virtual tags and
   linux VM design, I just want to encourage the discussion since
   I'd really like to see SMP work right on parisc.

If you flush caches exactly what sparc64 does in 2.5.x, and you do
have a virtually indexed, physically tagged cache, you should have no
correctness.  I've stressed that port to no end (in particular the LTP
suite has a great mmap/read/write coherency tester), and if there are
holes I'd like to know about them :-)

The sparc64 port only flushes when absolutely necessary.

The most crucial area to get efficient flushing is the
{copy,clear}_user_page implementation.  If you use temporary kernel
mappings mapped at virtual addresses matching the virtual color that
the user's mappings will have, this avoids virtually ALL of the
flushing for anonymous pages.  I haven't noticed too many ports pick
up this trick even though I mention it in cachetlb.txt.

On sparc64 I even save the original TLB entries before the flush
and restore them afterwards, so there is no TLB traffic as a result
of doing these temp mappings.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-06 14:14 [parisc-linux] Re: RFC: mmap patch John Marvin
  2003-03-06 14:31 ` Matthew Wilcox
@ 2003-03-08  6:30 ` Grant Grundler
  2003-03-08  6:29   ` David S. Miller
  1 sibling, 1 reply; 27+ messages in thread
From: Grant Grundler @ 2003-03-08  6:30 UTC (permalink / raw)
  To: John Marvin; +Cc: parisc-linux, davem

On Thu, Mar 06, 2003 at 07:14:53AM -0700, John Marvin wrote:
> Dave Miller designed the cache flushing strategy to have hooks in the
> machine independent code to support virtual tagged caches.

While Dave Miller has my respect, he doesn't get everything exactly
right on the first try.  Design review and feedback are needed
(eg pci_set_dma_mask()) to fix remaining issues with the design
(e.g. nits/extensions to PCI DMA support).

My point is the VM design may have "flaws" WRT parisc architecture.
If the existing hooks mean performance sucks on parisc, davem is
usually open (though a PITA to convince) to discussing the issues.

I don't pretend to understand the issues with virtual tags and
linux VM design, I just want to encourage the discussion since
I'd really like to see SMP work right on parisc.

grant

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08  6:29   ` David S. Miller
@ 2003-03-08 17:24     ` Grant Grundler
  2003-03-08 19:04       ` David S. Miller
  2003-03-08 23:11     ` Matthew Wilcox
  1 sibling, 1 reply; 27+ messages in thread
From: Grant Grundler @ 2003-03-08 17:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: jsm, parisc-linux

On Fri, Mar 07, 2003 at 10:29:45PM -0800, David S. Miller wrote:
> If you flush caches exactly what sparc64 does in 2.5.x, and you do
> have a virtually indexed, physically tagged cache, you should have no
> correctness.

After getting some sleep and thinking about how IOMMUs work on parisc,
I've convinced myself the CPU caches are virtually tagged and
virtually indexed. Part of the IOMMU function is to provide the
virtual address tag/index bits in order to acquire cacheline ownership
on behalf of the device doing DMA. I don't know exactly how that
matters to the VM design, but it seems relevant that parisc is not
physically indexed.

thanks,
grant

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 17:24     ` Grant Grundler
@ 2003-03-08 19:04       ` David S. Miller
  2003-03-08 20:42         ` Grant Grundler
  2003-03-08 22:45         ` Matthew Wilcox
  0 siblings, 2 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-08 19:04 UTC (permalink / raw)
  To: grundler; +Cc: jsm, parisc-linux

   From: grundler@dsl2.external.hp.com (Grant Grundler)
   Date: Sat, 8 Mar 2003 10:24:39 -0700
   
   After getting some sleep and thinking about how IOMMUs work on parisc,
   I've convinced myself the CPU caches are virtually tagged and
   virtually indexed.

Some quick googling suggests that they are physically tagged.

For example, see section 1.1 of the following paper:

	http://www.cs.washington.edu/homes/bershad/Papers/asplosVM.ps

It's an interesting paper btw too :-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 19:04       ` David S. Miller
@ 2003-03-08 20:42         ` Grant Grundler
  2003-03-08 22:45         ` Matthew Wilcox
  1 sibling, 0 replies; 27+ messages in thread
From: Grant Grundler @ 2003-03-08 20:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: jsm, parisc-linux

On Sat, Mar 08, 2003 at 11:04:59AM -0800, David S. Miller wrote:
> Some quick googling suggests that they are physically tagged.
> 
> For example, see section 1.1 of the following paper:
> 
> 	http://www.cs.washington.edu/homes/bershad/Papers/asplosVM.ps

yes, I'm just confused as usual.

> It's an interesting paper btw too :-)

ok - thanks.
It's not something I would have looked for normally.
I'll read it though to understand the problem better.

grant

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 19:04       ` David S. Miller
  2003-03-08 20:42         ` Grant Grundler
@ 2003-03-08 22:45         ` Matthew Wilcox
  2003-03-08 23:00           ` David S. Miller
  1 sibling, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2003-03-08 22:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: grundler, jsm, parisc-linux

On Sat, Mar 08, 2003 at 11:04:59AM -0800, David S. Miller wrote:
>    From: grundler@dsl2.external.hp.com (Grant Grundler)
>    Date: Sat, 8 Mar 2003 10:24:39 -0700
>    
>    After getting some sleep and thinking about how IOMMUs work on parisc,
>    I've convinced myself the CPU caches are virtually tagged and
>    virtually indexed.
> 
> Some quick googling suggests that they are physically tagged.

I'm not sure of the exact definitions of physically/virtually tagged.
The definition you're interested in is on Page F-6 of the PARISC 2.0
Architecture book by Gerry Kane.  Fortunately HP have it online at the
rather ugly URL:

http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,959!218!244,00.html


	The instruction and data caches are required to detect that the
	same physical memory location is being accessed by two virtual
	addresses that satisfy all the following requirements:

	   1. The two virtual addresses map to the same absolute address.
	   2. Offset bits 40 through 63 are the same in both virtual addresses.

The upshot is that if two addresses are congruent modulo 4MB and they
map to the same physical address, the cache will detect it.  I've been
thinking about (ab)using kmap() for this for a while.  The trouble
is we'd need to have 1024 slots just to be guaranteed space to map 1
page -- if we need to guarantee to be able to map two pages at once,
we need 2048 slots (ie 8MB of virtual space).  Etc.  I have no idea how
many pages we expect to be able to map simultaneously, and haven't been
able to get a straight answer out of anyone so far.

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 22:45         ` Matthew Wilcox
@ 2003-03-08 23:00           ` David S. Miller
  2003-03-08 23:27             ` Matthew Wilcox
                               ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-08 23:00 UTC (permalink / raw)
  To: willy; +Cc: grundler, jsm, parisc-linux

   From: Matthew Wilcox <willy@debian.org>
   Date: Sat, 8 Mar 2003 22:45:03 +0000
   
   The upshot is that if two addresses are congruent modulo 4MB and they
   map to the same physical address, the cache will detect it.  I've been
   thinking about (ab)using kmap() for this for a while.  The trouble
   is we'd need to have 1024 slots just to be guaranteed space to map 1
   page -- if we need to guarantee to be able to map two pages at once,
   we need 2048 slots (ie 8MB of virtual space).  Etc.  I have no idea how
   many pages we expect to be able to map simultaneously, and haven't been
   able to get a straight answer out of anyone so far.

You need merely 8MB of address space (2 * 4MB) if you implement
my {copy,clear}_user_page() dynamic mapping hack, that will be
tons more cheaper than any kmap based scheme and also be nicer
on the TLB as there will be zero TLB changes occurring around
the copy/clear.

People can continue to talk about all their bright new idea, which
is fine, but it feels like my known-working ideas are being ignored.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:11     ` Matthew Wilcox
@ 2003-03-08 23:02       ` David S. Miller
  2003-03-09 14:42         ` Matthew Wilcox
  0 siblings, 1 reply; 27+ messages in thread
From: David S. Miller @ 2003-03-08 23:02 UTC (permalink / raw)
  To: willy; +Cc: grundler, jsm, parisc-linux

   From: Matthew Wilcox <willy@debian.org>
   Date: Sat, 8 Mar 2003 23:11:25 +0000

   One other difference you forgot is that Sparc64 has writethrough caches
   and PA has writeback caches.  This can make a difference.
   
It certainly could, but I think sparc64's schemes will work.
Please document any differences you discover, preferably with
text to go into Documentation/cachetlb.txt

The big variable is whether the cache lines are tagged with
physical addresses or not, that _does_ make a known difference.

But, it appears that so far all information suggests that
PA Risc uses physical tags.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08  6:29   ` David S. Miller
  2003-03-08 17:24     ` Grant Grundler
@ 2003-03-08 23:11     ` Matthew Wilcox
  2003-03-08 23:02       ` David S. Miller
  1 sibling, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2003-03-08 23:11 UTC (permalink / raw)
  To: David S. Miller; +Cc: grundler, jsm, parisc-linux

On Fri, Mar 07, 2003 at 10:29:45PM -0800, David S. Miller wrote:
> If you flush caches exactly what sparc64 does in 2.5.x, and you do
> have a virtually indexed, physically tagged cache, you should have no
> correctness.  I've stressed that port to no end (in particular the LTP
> suite has a great mmap/read/write coherency tester), and if there are
> holes I'd like to know about them :-)

One other difference you forgot is that Sparc64 has writethrough caches
and PA has writeback caches.  This can make a difference.

> The most crucial area to get efficient flushing is the
> {copy,clear}_user_page implementation.  If you use temporary kernel
> mappings mapped at virtual addresses matching the virtual color that
> the user's mappings will have, this avoids virtually ALL of the
> flushing for anonymous pages.  I haven't noticed too many ports pick
> up this trick even though I mention it in cachetlb.txt.

I've certainly been thinking about doing it, but you're right we don't
do it yet.

> On sparc64 I even save the original TLB entries before the flush
> and restore them afterwards, so there is no TLB traffic as a result
> of doing these temp mappings.

Nice trick.

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:27             ` Matthew Wilcox
@ 2003-03-08 23:14               ` David S. Miller
  0 siblings, 0 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-08 23:14 UTC (permalink / raw)
  To: willy; +Cc: grundler, jsm, parisc-linux

	Sorry, I think you've misunderstood the problem.  This is
	write() vs mmap().  We call flush_dcache_page() after the
	write() -- but PA's flush_dcache_page() only flushes the
	kernel's mapping of that page, not the user's.  The question
	is how to fix that.

Right, flush_dcache_page() must get rid of all instances of
the underlying physical data from the cache.  This means all
kernel and user virtual mappings of the page.

Chips with a "flush physical page X" scheme can just do that,
others can flush the kernel virtual mapping then walk the
VMA list of user mappings and hit those one by one.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:31             ` Randolph Chung
@ 2003-03-08 23:15               ` David S. Miller
  0 siblings, 0 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-08 23:15 UTC (permalink / raw)
  To: tausq; +Cc: willy, grundler, jsm, parisc-linux

   From: Randolph Chung <tausq@debian.org>
   Date: Sat, 8 Mar 2003 15:31:41 -0800

   > You need merely 9MB of address space (2 * 4MB) if you implement
   > my {copy,clear}_user_page() dynamic mapping hack, that will be
   > tons more cheaper than any kmap based scheme and also be nicer
   > on the TLB as there will be zero TLB changes occurring around
   > the copy/clear.
   
   i hope i'm not missing something obvious, but from what i understand
   from this discussion, {copy,user}_user_page helps with the initial
   syncing up of new mappings that are created for a process... but how
   does this deal with syncing up changes afterwards? for that we still
   need the flush_dcache_page to walk the user mappings right?
   
Yes, the {copy,clear}_user_page() thing is an optimization for
handling of cache alias issues wrt. anonymous pages.

The PA-RISC mmap/write bug is something entirely different,
flush_dcache_page() just isn't doing what it is defined to
do :-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:00           ` David S. Miller
@ 2003-03-08 23:27             ` Matthew Wilcox
  2003-03-08 23:14               ` David S. Miller
  2003-03-08 23:31             ` Randolph Chung
  2003-03-09  2:15             ` Grant Grundler
  2 siblings, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2003-03-08 23:27 UTC (permalink / raw)
  To: David S. Miller; +Cc: willy, grundler, jsm, parisc-linux

On Sat, Mar 08, 2003 at 03:00:23PM -0800, David S. Miller wrote:
> You need merely 8MB of address space (2 * 4MB) if you implement
> my {copy,clear}_user_page() dynamic mapping hack, that will be
> tons more cheaper than any kmap based scheme and also be nicer
> on the TLB as there will be zero TLB changes occurring around
> the copy/clear.
> 
> People can continue to talk about all their bright new idea, which
> is fine, but it feels like my known-working ideas are being ignored.

Sorry, I think you've misunderstood the problem.  This is write() vs mmap().
We call flush_dcache_page() after the write() -- but PA's flush_dcache_page() only flushes the kernel's mapping of that page, not the user's.  The question is how to fix that.

Randolph published a patch which flushes two pages -- one which is the
kernel's view of the page, and one which is the user's view of the page.
John disagreed that was necessary, and I'm proposing a way (with kmap)
to avoid it.

Nothing to do with copy/clear_user_page().

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:00           ` David S. Miller
  2003-03-08 23:27             ` Matthew Wilcox
@ 2003-03-08 23:31             ` Randolph Chung
  2003-03-08 23:15               ` David S. Miller
  2003-03-09  2:15             ` Grant Grundler
  2 siblings, 1 reply; 27+ messages in thread
From: Randolph Chung @ 2003-03-08 23:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: willy, grundler, jsm, parisc-linux

> You need merely 9MB of address space (2 * 4MB) if you implement
> my {copy,clear}_user_page() dynamic mapping hack, that will be
> tons more cheaper than any kmap based scheme and also be nicer
> on the TLB as there will be zero TLB changes occurring around
> the copy/clear.

i hope i'm not missing something obvious, but from what i understand
from this discussion, {copy,user}_user_page helps with the initial
syncing up of new mappings that are created for a process... but how
does this deal with syncing up changes afterwards? for that we still
need the flush_dcache_page to walk the user mappings right?

randolph
-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:00           ` David S. Miller
  2003-03-08 23:27             ` Matthew Wilcox
  2003-03-08 23:31             ` Randolph Chung
@ 2003-03-09  2:15             ` Grant Grundler
  2 siblings, 0 replies; 27+ messages in thread
From: Grant Grundler @ 2003-03-09  2:15 UTC (permalink / raw)
  To: David S. Miller; +Cc: willy, jsm, parisc-linux

On Sat, Mar 08, 2003 at 03:00:23PM -0800, David S. Miller wrote:
...
> People can continue to talk about all their bright new idea, which
> is fine, but it feels like my known-working ideas are being ignored.

As noted earlier, different problem.

Since about Nov 2002 we have been working on better {copy,clear}_user_page()
implementations quietly inside HP.  Unfortunately some, uhm, "issues" are
taking a long time to resolve before the routines can be published.
Once those issues are resolved, I think your trick with temporary
TLB entries can get implemented.

thanks,
grant

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [parisc-linux] Re: RFC: mmap patch
@ 2003-03-09  3:42 John Marvin
  2003-03-09 21:29 ` David S. Miller
  0 siblings, 1 reply; 27+ messages in thread
From: John Marvin @ 2003-03-09  3:42 UTC (permalink / raw)
  To: parisc-linux; +Cc: davem

>
> Yes, the {copy,clear}_user_page() thing is an optimization for
> handling of cache alias issues wrt. anonymous pages.
>
I already implemented that solution on the parisc port. Well actually
only half :-). I implemented both the copy and clear user page optimizations,
but later on disabled the copy_user_page solution. I can't remember the
exact details of the problem. Based on a comment I wrote, it seems that in
some cases copy_user_page was being used for something that was going to
be executed. I can't remember the exact scenario, since copy_user_page
is only called in copy-on-write situations.

The problem is that since we have split I and D caches, the cache would
need to be flushed anyway so that the instructions could be seen for
execution.  So it didn't make any sense to set up the temporary
translation and then flush it anyway; better to just do it through the
kernel translation and flush.  I believe this isn't a problem on sparc64
because of the write-through cache.

> The PA-RISC mmap/write bug is something entirely different,
> flush_dcache_page() just isn't doing what it is defined to
> do :-)

Ok.  But looking at the sparc64 implementation of flush_dcache_page I
don't see any code to traverse user vma's.  Is there some other
architecture specific trick that is being used to flush any user mappings
that are mapped to the same physical address as the kernel address?

John Marvin
jsm@fc.hp.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [parisc-linux] Re: RFC: mmap patch
@ 2003-03-09  3:51 John Marvin
  2003-03-09 21:31 ` David S. Miller
  0 siblings, 1 reply; 27+ messages in thread
From: John Marvin @ 2003-03-09  3:51 UTC (permalink / raw)
  To: parisc-linux; +Cc: davem

On Sat, Mar 08, 2003 at 03:00:23PM -0800, David S. Miller wrote:
> ...
> Since about Nov 2002 we have been working on better {copy,clear}_user_page()
> implementations quietly inside HP.  Unfortunately some, uhm, "issues" are
> taking a long time to resolve before the routines can be published.
> Once those issues are resolved, I think your trick with temporary
> TLB entries can get implemented.
>
The tlb trick for clear_user_page is already implemented. The code for
copy_user_page is there, but turned off due to an issue mentioned in my
last email. I believe the failure case was not a frequent path, and there
are ways of working around that problem.

The address that the copying is done through (e,g. the tlb trick) is
independent of the algorithm used to actually do the copying.

John

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-08 23:02       ` David S. Miller
@ 2003-03-09 14:42         ` Matthew Wilcox
  2003-03-09 21:38           ` David S. Miller
  2003-03-14 13:04           ` Jochen Friedrich
  0 siblings, 2 replies; 27+ messages in thread
From: Matthew Wilcox @ 2003-03-09 14:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: willy, grundler, jsm, parisc-linux

On Sat, Mar 08, 2003 at 03:02:14PM -0800, David S. Miller wrote:
>    One other difference you forgot is that Sparc64 has writethrough caches
>    and PA has writeback caches.  This can make a difference.
>    
> It certainly could, but I think sparc64's schemes will work.
> Please document any differences you discover, preferably with
> text to go into Documentation/cachetlb.txt

Well.. one patch which affected us, but wouldn't affect a writethrough
cache is:

http://ftp.linux.org.uk/pub/linux/willy/patches/applied/shmem.diff

In that situation, the user has no view of that page currently so they
will read from memory.  With a writethrough cache it works without the
flush_dcache_page(), with writeback it does not.

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [parisc-linux] Re: RFC: mmap patch
  2003-03-09  3:42 John Marvin
@ 2003-03-09 21:29 ` David S. Miller
  0 siblings, 0 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-09 21:29 UTC (permalink / raw)
  To: jsm; +Cc: parisc-linux

   From: John Marvin <jsm@udlkern.fc.hp.com>
   Date: Sat, 8 Mar 2003 20:42:04 -0700 (MST)

   The problem is that since we have split I and D caches, the cache would
   need to be flushed anyway so that the instructions could be seen for
   execution.  So it didn't make any sense to set up the temporary
   translation and then flush it anyway; better to just do it through the
   kernel translation and flush.  I believe this isn't a problem on sparc64
   because of the write-through cache.
   
Handled on sparc64, when TIF_BLKCOMMIT flag is set, we use a more
expensive block store instruction in {copy,clear}_user_page() which
while more expensive does keep the instruction cache up to date.

More modent sparc64 chips don't even need this as the store stream
of the cpu is fully snooped by the instruction cache.

   > The PA-RISC mmap/write bug is something entirely different,
   > flush_dcache_page() just isn't doing what it is defined to
   > do :-)
   
   Ok.  But looking at the sparc64 implementation of flush_dcache_page I
   don't see any code to traverse user vma's.  Is there some other
   architecture specific trick that is being used to flush any user mappings
   that are mapped to the same physical address as the kernel address?
   
On sparc64 we can remove an arbitrary "physical" page from the D-cache
in one pass.  It does not matter what virtual address they use.

So it is handled, you just don't understand the hardware :-)

Because the chip has this interface, we need not walk the VMA list.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [parisc-linux] Re: RFC: mmap patch
  2003-03-09  3:51 John Marvin
@ 2003-03-09 21:31 ` David S. Miller
  0 siblings, 0 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-09 21:31 UTC (permalink / raw)
  To: jsm; +Cc: parisc-linux

   From: John Marvin <jsm@udlkern.fc.hp.com>
   Date: Sat, 8 Mar 2003 20:51:55 -0700 (MST)

   The tlb trick for clear_user_page is already implemented. The code for
   copy_user_page is there, but turned off due to an issue mentioned in my
   last email. I believe the failure case was not a frequent path, and there
   are ways of working around that problem.
   
Fix it by adding a thread flag, "TIF_NEED_ICACHE_FLUSH" or whatever,
and at the end of copy_user_page() if this bit is set you flush the
page from the destination instruction cache.

The flag is set in do_parisc_fault() if the fault is for a write
and:

	((vma->vm_flags & VM_EXEC) != 0 &&
	 vma->vm_file != NULL)

See the TIF_BLKCOMMIT logic in arch/sparc64/mm/fault.c

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-09 14:42         ` Matthew Wilcox
@ 2003-03-09 21:38           ` David S. Miller
  2003-03-10  1:50             ` Matthew Wilcox
  2003-03-14 13:04           ` Jochen Friedrich
  1 sibling, 1 reply; 27+ messages in thread
From: David S. Miller @ 2003-03-09 21:38 UTC (permalink / raw)
  To: willy; +Cc: grundler, jsm, parisc-linux

   From: Matthew Wilcox <willy@debian.org>
   Date: Sun, 9 Mar 2003 14:42:25 +0000
   
   Well.. one patch which affected us, but wouldn't affect a writethrough
   cache is:
   
   http://ftp.linux.org.uk/pub/linux/willy/patches/applied/shmem.diff
   
   In that situation, the user has no view of that page currently so they
   will read from memory.  With a writethrough cache it works without the
   flush_dcache_page(), with writeback it does not.
   
I think the real issue is SHMLBA isn't being honored, right?

The whole purpose of SHMLBA is to ensure a particular alignment
for shared memory anonymous regions, if that isn't happening anymore
that is a bug.

This is why SHMLBA is set to the virtual cache size on sparc32 for
certain cpus and on sparc64 for all cpus.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-09 21:38           ` David S. Miller
@ 2003-03-10  1:50             ` Matthew Wilcox
  2003-03-10  5:18               ` David S. Miller
  0 siblings, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2003-03-10  1:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: willy, grundler, jsm, parisc-linux

On Sun, Mar 09, 2003 at 01:38:30PM -0800, David S. Miller wrote:
> I think the real issue is SHMLBA isn't being honored, right?

I don't think so.. according to cachetlb.txt:

If your D-cache has this problem, first define asm/shmparam.h SHMLBA
properly, it should essentially be the size of your virtually
addressed D-cache (or if the size is variable, the largest possible
size).  This setting will force the SYSv IPC layer to only allow user
processes to mmap shared memory at address which are a multiple of
this value.

That isn't where the problem lies.  This missing flush is in
shmem_getpage(), round about line 906 in 2.5.64.  Really, this should
be doing a clear_user_page() -- but not all callers of this routine have
a user address to pass on.

> The whole purpose of SHMLBA is to ensure a particular alignment
> for shared memory anonymous regions, if that isn't happening anymore
> that is a bug.

I thought this was to deal with user-user coherency problems, not
user-kernel?

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-10  1:50             ` Matthew Wilcox
@ 2003-03-10  5:18               ` David S. Miller
  0 siblings, 0 replies; 27+ messages in thread
From: David S. Miller @ 2003-03-10  5:18 UTC (permalink / raw)
  To: willy; +Cc: grundler, jsm, parisc-linux

   From: Matthew Wilcox <willy@debian.org>
   Date: Mon, 10 Mar 2003 01:50:05 +0000
   
   Really, this should be doing a clear_user_page() -- but not all
   callers of this routine have a user address to pass on.

That sounds like the best fix to me.

The user address really is a "hint" to the routine, it need not be
precise.  Feel free to make this explicit in cachetlb.txt

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-09 14:42         ` Matthew Wilcox
  2003-03-09 21:38           ` David S. Miller
@ 2003-03-14 13:04           ` Jochen Friedrich
  2003-03-14 16:23             ` Grant Grundler
  1 sibling, 1 reply; 27+ messages in thread
From: Jochen Friedrich @ 2003-03-14 13:04 UTC (permalink / raw)
  To: HP900 PARISC mailing list

Hi,

since the patch was included in CVS, several modules fail to load with
unresolved external __flush_dcache_page.

This should fix it:

# cvs diff -u parisc_ksyms.c
Index: parisc_ksyms.c
===================================================================
RCS file: /var/cvs/linux/arch/parisc/kernel/parisc_ksyms.c,v
retrieving revision 1.44
diff -u -r1.44 parisc_ksyms.c
--- parisc_ksyms.c      11 Jan 2003 20:10:46 -0000      1.44
+++ parisc_ksyms.c      14 Mar 2003 12:49:49 -0000
@@ -127,6 +127,7 @@
 #include <asm/cache.h>
 EXPORT_SYMBOL(flush_kernel_dcache_range_asm);
 EXPORT_SYMBOL(flush_kernel_dcache_page);
+EXPORT_SYMBOL(__flush_dcache_page);
 EXPORT_SYMBOL(flush_all_caches);

 #include <asm/unistd.h>

--jochen

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [parisc-linux] Re: RFC: mmap patch
  2003-03-14 13:04           ` Jochen Friedrich
@ 2003-03-14 16:23             ` Grant Grundler
  0 siblings, 0 replies; 27+ messages in thread
From: Grant Grundler @ 2003-03-14 16:23 UTC (permalink / raw)
  To: Jochen Friedrich; +Cc: HP900 PARISC mailing list

On Fri, Mar 14, 2003 at 02:04:17PM +0100, Jochen Friedrich wrote:
> since the patch was included in CVS, several modules fail to load with
> unresolved external __flush_dcache_page.

yes...I just ran into the same problem when playing around with
the latest qlogic (qla2xxx v6.0.4) driver.
Appended is a more complete one which I'm not quite done with
yet either. Hopefully I can build all the flavors and commit this weekend.

grant

Index: arch/parisc/kernel/parisc_ksyms.c
===================================================================
RCS file: /var/cvs/linux/arch/parisc/kernel/parisc_ksyms.c,v
retrieving revision 1.44
diff -u -p -r1.44 parisc_ksyms.c
--- arch/parisc/kernel/parisc_ksyms.c   11 Jan 2003 20:10:46 -0000
1.44
+++ arch/parisc/kernel/parisc_ksyms.c   14 Mar 2003 16:16:47 -0000
@@ -127,7 +127,15 @@ EXPORT_SYMBOL(outsl);
 #include <asm/cache.h>
 EXPORT_SYMBOL(flush_kernel_dcache_range_asm);
 EXPORT_SYMBOL(flush_kernel_dcache_page);
-EXPORT_SYMBOL(flush_all_caches);
+
+/* asm/pgalloc.h doesn't include all it's dependencies */
+extern void __flush_dcache_page(struct page *page);
+EXPORT_SYMBOL(__flush_dcache_page);
+
+#ifndef CONFIG_SMP
+extern void flush_cache_all_local(void);
+EXPORT_SYMBOL(flush_cache_all_local);
+#endif
 
 #include <asm/unistd.h>
 extern long sys_open(const char *, int, int);

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2003-03-14 16:23 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-03-06 14:14 [parisc-linux] Re: RFC: mmap patch John Marvin
2003-03-06 14:31 ` Matthew Wilcox
2003-03-06 15:31   ` Randolph Chung
2003-03-08  6:30 ` Grant Grundler
2003-03-08  6:29   ` David S. Miller
2003-03-08 17:24     ` Grant Grundler
2003-03-08 19:04       ` David S. Miller
2003-03-08 20:42         ` Grant Grundler
2003-03-08 22:45         ` Matthew Wilcox
2003-03-08 23:00           ` David S. Miller
2003-03-08 23:27             ` Matthew Wilcox
2003-03-08 23:14               ` David S. Miller
2003-03-08 23:31             ` Randolph Chung
2003-03-08 23:15               ` David S. Miller
2003-03-09  2:15             ` Grant Grundler
2003-03-08 23:11     ` Matthew Wilcox
2003-03-08 23:02       ` David S. Miller
2003-03-09 14:42         ` Matthew Wilcox
2003-03-09 21:38           ` David S. Miller
2003-03-10  1:50             ` Matthew Wilcox
2003-03-10  5:18               ` David S. Miller
2003-03-14 13:04           ` Jochen Friedrich
2003-03-14 16:23             ` Grant Grundler
  -- strict thread matches above, loose matches on Subject: below --
2003-03-09  3:42 John Marvin
2003-03-09 21:29 ` David S. Miller
2003-03-09  3:51 John Marvin
2003-03-09 21:31 ` David S. Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox