Linux MIPS Architecture development
* Re: Instability / caching problems on Qube 2 - solved ?
From: Ralf Baechle @ 2003-12-15  2:27 UTC (permalink / raw)
  To: Peter Horton; +Cc: linux-mips

On Sun, Dec 14, 2003 at 04:26:05PM +0000, Peter Horton wrote:

> When mapping an executable image into user space the kernel reads data
> into the page cache and then maps the page into user space. For an
> executable page no copy is done as the mapping is read only.

Correct.

The kernel may also share writable pages until they're actually written to.
This is called copy-on-write (COW).  But executable pages usually aren't
COW so this case isn't meaningful for us.

> On my Qube
> the act of reading data from the IDE via PIO causes the data to be
> placed in the D-cache (the RM52xx cache does write allocate), but the
> page never gets flushed to physical memory and so suffers from cache
> aliasing problems when it's mapped into user space.
> 
> By enabling DMA on the IDE interface (it's off in the default Cobalt
> config) the kernel suddenly becomes stable (the page in the page cache
> never gets pulled into the D-cache).
> 
> This seems to be a generic kernel problem - all architectures with VI
> caches and write allocate policies could trigger it.

Now that's where I'm getting some doubts about your explanation.  Assume
we're paging in a page that isn't mapped yet:

In this case do_no_page() will load the page.  Any DMA cache coherency
issues are supposed to be handled by the driver.  That means for an
executable page all that's missing is ensuring the I-cache is coherent.
This is done in these two lines:

[...]
                flush_page_to_ram(new_page);
                flush_icache_page(vma, new_page);
[...]
        update_mmu_cache(vma, address, entry);
[...]

   flush_page_to_ram is (and must be!) a no-op.  So the burden is entirely
   up to flush_icache_page and update_mmu_cache.  Note flush_dcache_page
   never enters the picture when mapping an executable because the file has
   not been written to.  So let's see flush_icache_page:

static void r4k_flush_icache_page(struct vm_area_struct *vma,
	struct page *page)
{
	/*
	 * If there's no context yet, or the page isn't executable, no icache
	 * flush is needed.
	 */
	if (!(vma->vm_flags & VM_EXEC))
		return;

All this is only about I-cache coherence.  That is we do nothing at all if
this isn't an executable page.

	/*
	 * Tricky ...  Because we don't know the virtual address we've got the
	 * choice of either invalidating the entire primary and secondary
	 * caches or invalidating the secondary caches also.  With the subset
	 * enforcement on R4000SC, R4400SC, R10000 and R12000 invalidating the
	 * secondary cache will result in any entries in the primary caches
	 * also getting invalidated which hopefully is a bit more economical.
	 */
	if (cpu_has_subset_pcaches) {
		unsigned long addr = (unsigned long) page_address(page);
		r4k_blast_scache_page(addr);

		return;
	}

This section is only needed for certain processors such as the R4000SC.
That is it's not of interest here either.

	if (!cpu_has_ic_fills_f_dc) {
		unsigned long addr = (unsigned long) page_address(page);
		r4k_blast_dcache_page(addr);
	}

But cpu_has_ic_fills_f_dc is always zero on Nevada.  Which means we're
going to flush the page's kernel address from the D-cache here.

	/*
	 * We're not sure of the virtual address(es) involved here, so
	 * we have to flush the entire I-cache.
	 */
	if (cpu_has_vtag_icache) {
		int cpu = smp_processor_id();

		if (cpu_context(cpu, vma->vm_mm) != 0)
			drop_mmu_context(vma->vm_mm, cpu);

... cpu_has_vtag_icache is zero on Nevada so the else case will be taken:
	} else
		r4k_blast_icache();

so we just blast away the entire I-cache.  Coherency the hard way.  At
this point we've established I-cache coherency for executable pages.

But what if this was a non-executable page?  Then flush_icache_page would do
nothing at all - nor would update_mmu_cache.   The page will be copied to
userspace and ...  whoops, data may still be in the wrong cache segment,
game over.  This also explains a few other bugs.

> So where's the correct place to put the flush_dcache_page() ? :-)
> 
> I don't know whether the problem could affect any other IO subsystems
> ... probably SCSI at least.

As you describe it, it doesn't seem specific to any particular kind of
device - only DMA or PIO matters; and the DMA coherency thing happens to
paint over the issue which must be why it wasn't discovered for so long.

  Ralf


* Re: Instability / caching problems on Qube 2 - solved ?
From: Peter Horton @ 2003-12-15  8:32 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Peter Horton, linux-mips

On Mon, Dec 15, 2003 at 03:27:17AM +0100, Ralf Baechle wrote:
> On Sun, Dec 14, 2003 at 04:26:05PM +0000, Peter Horton wrote:
> 
> > When mapping an executable image into user space the kernel reads data
> > into the page cache and then maps the page into user space. For an
> > executable page no copy is done as the mapping is read only.
> 
> Correct.
> 
> The kernel may also share writable pages until they're actually written to.
> This is called copy-on-write (COW).  But executable pages usually aren't
> COW so this case isn't meaningful for us.
> 
> > On my Qube
> > the act of reading data from the IDE via PIO causes the data to be
> > placed in the D-cache (the RM52xx cache does write allocate), but the
> > page never gets flushed to physical memory and so suffers from cache
> > aliasing problems when it's mapped into user space.
> > 
> > By enabling DMA on the IDE interface (it's off in the default Cobalt
> > config) the kernel suddenly becomes stable (the page in the page cache
> > never gets pulled into the D-cache).
> > 
> > This seems to be a generic kernel problem - all architectures with VI
> > caches and write allocate policies could trigger it.
> 
> Now that's where I'm getting some doubts about your explanation.  Assume
> we're paging in a page that isn't mapped yet:
> 
> In this case do_no_page() will load the page.  Any DMA cache coherency
> issues are supposed to be handled by the driver.  That means for an
> executable page all that's missing is ensuring the I-cache is coherent.
> This is done in these two lines:
> 
> [...]
>                 flush_page_to_ram(new_page);
>                 flush_icache_page(vma, new_page);
> [...]
>         update_mmu_cache(vma, address, entry);
> [...]
> 
>    flush_page_to_ram is (and must be!) a no-op.  So the burden is entirely
>    up to flush_icache_page and update_mmu_cache.  Note flush_dcache_page
>    never enters the picture when mapping an executable because the file has
>    not been written to.  So let's see flush_icache_page:
> 
> static void r4k_flush_icache_page(struct vm_area_struct *vma,
> 	struct page *page)
> {
> 	/*
> 	 * If there's no context yet, or the page isn't executable, no icache
> 	 * flush is needed.
> 	 */
> 	if (!(vma->vm_flags & VM_EXEC))
> 		return;
> 
> All this is only about I-cache coherence.  That is we do nothing at all if
> this isn't an executable page.
> 
> 	/*
> 	 * Tricky ...  Because we don't know the virtual address we've got the
> 	 * choice of either invalidating the entire primary and secondary
> 	 * caches or invalidating the secondary caches also.  With the subset
> 	 * enforcement on R4000SC, R4400SC, R10000 and R12000 invalidating the
> 	 * secondary cache will result in any entries in the primary caches
> 	 * also getting invalidated which hopefully is a bit more economical.
> 	 */
> 	if (cpu_has_subset_pcaches) {
> 		unsigned long addr = (unsigned long) page_address(page);
> 		r4k_blast_scache_page(addr);
> 
> 		return;
> 	}
> 
> This section is only needed for certain processors such as the R4000SC.
> That is it's not of interest here either.
> 
> 	if (!cpu_has_ic_fills_f_dc) {
> 		unsigned long addr = (unsigned long) page_address(page);
> 		r4k_blast_dcache_page(addr);
> 	}
> 
> But cpu_has_ic_fills_f_dc is always zero on Nevada.  Which means we're
> going to flush the page's kernel address from the D-cache here.
> 
> 	/*
> 	 * We're not sure of the virtual address(es) involved here, so
> 	 * we have to flush the entire I-cache.
> 	 */
> 	if (cpu_has_vtag_icache) {
> 		int cpu = smp_processor_id();
> 
> 		if (cpu_context(cpu, vma->vm_mm) != 0)
> 			drop_mmu_context(vma->vm_mm, cpu);
> 
> ... cpu_has_vtag_icache is zero on Nevada so the else case will be taken:
> 	} else
> 		r4k_blast_icache();
> 
> so we just blast away the entire I-cache.  Coherency the hard way.  At
> this point we've established I-cache coherency for executable pages.
> 
> But what if this was a non-executable page?  Then flush_icache_page would do
> nothing at all - nor would update_mmu_cache.   The page will be copied to
> userspace and ...  whoops, data may still be in the wrong cache segment,
> game over.  This also explains a few other bugs.
> 

I could see the aliases at the end of do_no_page() (using memcmp()), and
from the code I knew they had to be read-only, so I just assumed they were
executable pages. I missed the fact that flush_icache_page() flushes the
D-cache page. So, as you say, it must be non-executable read-only pages
that cause the problem.

> > So where's the correct place to put the flush_dcache_page() ? :-)
> > 
> > I don't know whether the problem could affect any other IO subsystems
> > ... probably SCSI at least.
> 
> As you describe it, it doesn't seem specific to any particular kind of
> device - only DMA or PIO matters; and the DMA coherency thing happens to
> paint over the issue which must be why it wasn't discovered for so long.
> 

So how do we fix it ? Flushing the page really needs to be done just
after the IO into the page cache is complete so we only do it once per
page cache page ?

P.


* Re: Instability / caching problems on Qube 2 - solved ?
From: Dominic Sweetman @ 2003-12-15  9:04 UTC (permalink / raw)
  To: Peter Horton; +Cc: Ralf Baechle, linux-mips


My prejudices are showing but...

o Shouldn't the kernel have a zero-tolerance policy towards cache
  aliases?  That is, no D-cache alias should ever be permitted to
  happen, not even in data you reasonably hope might be read-only?
  
  Aliases only appeared by a kind of mistake when the R4000 was
  opportunistically repackaged without the secondary cache (the L2
  cache tags used to keep track of the virtually-indexed L1s, and you
  got an exception if you created an L1-alias).

  They really aren't a feature to be tolerated in the hope you can
  clean up before disaster strikes.

o And I could never get my brains round cache maintenance if I used
  the same word ("flush") both for invalidate and write-back.

--
Dominic Sweetman
MIPS Technologies.


* Re: Instability / caching problems on Qube 2 - solved ?
From: Ralf Baechle @ 2003-12-15 19:51 UTC (permalink / raw)
  To: Dominic Sweetman; +Cc: Peter Horton, linux-mips

On Mon, Dec 15, 2003 at 09:04:49AM +0000, Dominic Sweetman wrote:

> My prejudices are showing but...
> 
> o Shouldn't the kernel have a zero-tolerance policy towards cache
>   aliases?  That is, no D-cache alias should ever be permitted to
>   happen, not even in data you reasonably hope might be read-only?

We're already trying hard to avoid such aliases.  The case found by Peter
is clearly a bug and nothing else.

>   Aliases only appeared by a kind of mistake when the R4000 was
>   opportunistically repackaged without the secondary cache (the L2
>   cache tags used to keep track of the virtually-indexed L1s, and you
>   got an exception if you created an L1-alias).
> 
>   They really aren't a feature to be tolerated in the hope you can
>   clean up before disaster strikes.

These days R4000SC is only an ancient processor - but very valuable for
Linux maintenance because its virtual coherency exception is the
only available hardware detector for aliases.

> o And I could never get my brains round cache maintenance if I used
>   the same word ("flush") both for invalidate and write-back.

I once had a discussion about the terminology with maintainers of other
architectures.  Turned out none of the MIPS terms were really unambiguous;
does flush imply a writeback, does it imply invalidation?  Does
invalidate imply writing back to memory or writeback imply invalidation
etc. etc. ad infinitum.  Confusion pure ...

  Ralf
