* Re: Instability / caching problems on Qube 2 - solved ?
       [not found] <20031214162605.GA18357@skeleton-jack>
@ 2003-12-15  2:27 ` Ralf Baechle
  2003-12-15  8:32   ` Peter Horton
  0 siblings, 1 reply; 4+ messages in thread
From: Ralf Baechle @ 2003-12-15  2:27 UTC
  To: Peter Horton; +Cc: linux-mips

On Sun, Dec 14, 2003 at 04:26:05PM +0000, Peter Horton wrote:

> When mapping an executable image into user space the kernel reads data
> into the page cache and then maps the page into user space. For an
> executable page no copy is done as the mapping is read only.

Correct.

The kernel may also share writable pages until they're actually written to.
This is called copy-on-write (COW).  But executable pages usually aren't
COW so this case isn't meaningful for us.

> On my Qube
> the act of reading data from the IDE via PIO causes the data to be
> placed in the D-cache (the RM52xx cache does write allocate), but the
> page never gets flushed to physical memory and so suffers from cache
> aliasing problems when it's mapped into user space.
>
> By enabling DMA on the IDE interface (it's off in the default Cobalt
> config) the kernel suddenly becomes stable (the page in the page cache
> never gets pulled into the D-cache).
>
> This seems to be a generic kernel problem - all architectures with VI
> caches and write allocate policies could trigger it.

Now that's where I'm getting some doubts about your explanation.  Assume
we're paging in a page that isn't mapped yet:

In this case do_no_page() will load the page.  Any DMA cache coherency
issues are supposed to be handled by the driver.  That means for an
executable page all that's missing is ensuring the I-cache is coherent.
This is done in these two lines:

[...]
        flush_page_to_ram(new_page);
        flush_icache_page(vma, new_page);
[...]
        update_mmu_cache(vma, address, entry);
[...]

flush_page_to_ram is (and must be!) a no-op.  So the burden is entirely
up to flush_icache_page and update_mmu_cache.
Note flush_dcache_page never enters the picture when mapping an executable
because the file has not been written to.  So let's see flush_icache_page:

static void r4k_flush_icache_page(struct vm_area_struct *vma,
        struct page *page)
{
        /*
         * If there's no context yet, or the page isn't executable, no icache
         * flush is needed.
         */
        if (!(vma->vm_flags & VM_EXEC))
                return;

All this is only about I-cache coherence.  That is, we do nothing at all if
this isn't an executable page.

        /*
         * Tricky ...  Because we don't know the virtual address we've got the
         * choice of either invalidating the entire primary and secondary
         * caches or invalidating the secondary caches also.  With the subset
         * enforcement on R4000SC, R4400SC, R10000 and R12000 invalidating the
         * secondary cache will result in any entries in the primary caches
         * also getting invalidated which hopefully is a bit more economical.
         */
        if (cpu_has_subset_pcaches) {
                unsigned long addr = (unsigned long) page_address(page);
                r4k_blast_scache_page(addr);

                return;
        }

This section is only needed for certain processors such as the R4000SC.
That is, it's not of interest here either.

        if (!cpu_has_ic_fills_f_dc) {
                unsigned long addr = (unsigned long) page_address(page);
                r4k_blast_dcache_page(addr);
        }

But cpu_has_ic_fills_f_dc is always zero on Nevada.  Which means we're
going to flush the page's kernel address from the D-cache here.

        /*
         * We're not sure of the virtual address(es) involved here, so
         * we have to flush the entire I-cache.
         */
        if (cpu_has_vtag_icache) {
                int cpu = smp_processor_id();

                if (cpu_context(cpu, vma->vm_mm) != 0)
                        drop_mmu_context(vma->vm_mm, cpu);

...

cpu_has_vtag_icache is zero on Nevada so the else case will be taken:

        } else
                r4k_blast_icache();

so we just blast away the entire I-cache.  Coherency the hard way.  At
this point we've established I-cache coherency for executable pages.

But what if this was a non-executable page?  Then flush_icache_page would do
nothing at all - nor would update_mmu_cache.
The page will be copied to userspace and ... whoops, data may still be in
the wrong cache segment, game over.  This also explains a few other bugs.

> So where's the correct place to put the flush_dcache_page() ? :-)
>
> I don't know whether the problem could affect any other IO subsystems
> ... probably SCSI at least.

As you describe it, it doesn't seem specific to any particular kind of
device - only DMA or PIO matters; and the DMA coherency thing happens to
paint over the issue, which must be why it wasn't discovered for so long.

  Ralf
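
The "wrong cache segment" in the message above is what is usually called a
cache colour mismatch: in a virtually indexed cache whose way size exceeds
the page size, two virtual addresses for the same physical page can select
different cache lines.  A minimal sketch in plain C follows.  The geometry
(16 KB two-way D-cache, 4 KB pages) is an assumption chosen for
illustration and is not taken from the thread, and the dcache_* helpers are
hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. */
#define PAGE_SHIFT   12u                          /* 4 KB pages */
#define DCACHE_SIZE  (16u * 1024u)                /* total D-cache size */
#define DCACHE_WAYS  2u                           /* set associativity */
#define WAY_SIZE     (DCACHE_SIZE / DCACHE_WAYS)  /* bytes indexed per way */

/* Number of page colours: how many pages fit into one cache way. */
static inline unsigned int dcache_colours(void)
{
        return WAY_SIZE >> PAGE_SHIFT;
}

/* Colour of a virtual address: the index bits above the page offset. */
static inline unsigned int dcache_colour(uintptr_t vaddr)
{
        /* The mask below only works for a power-of-two way size. */
        assert((WAY_SIZE & (WAY_SIZE - 1u)) == 0);
        return (unsigned int)((vaddr & (WAY_SIZE - 1u)) >> PAGE_SHIFT);
}

/*
 * Two mappings of the same physical page can hold inconsistent copies
 * in the D-cache only if their colours differ.
 */
static inline bool dcache_aliases(uintptr_t kaddr, uintptr_t uaddr)
{
        return dcache_colour(kaddr) != dcache_colour(uaddr);
}
```

With this geometry there are two colours, so a page whose kernel address
has colour 1 and whose user address has colour 0 occupies different cache
lines under each mapping; dirty data written through one mapping is then
invisible through the other until it is written back.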
* Re: Instability / caching problems on Qube 2 - solved ?
  2003-12-15  2:27 ` Instability / caching problems on Qube 2 - solved ? Ralf Baechle
@ 2003-12-15  8:32   ` Peter Horton
  2003-12-15  9:04     ` Dominic Sweetman
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Horton @ 2003-12-15  8:32 UTC
  To: Ralf Baechle; +Cc: Peter Horton, linux-mips

On Mon, Dec 15, 2003 at 03:27:17AM +0100, Ralf Baechle wrote:
> On Sun, Dec 14, 2003 at 04:26:05PM +0000, Peter Horton wrote:
>
> > When mapping an executable image into user space the kernel reads data
> > into the page cache and then maps the page into user space. For an
> > executable page no copy is done as the mapping is read only.
>
> Correct.
>
> The kernel may also share writable pages until they're actually written to.
> This is called copy-on-write (COW).  But executable pages usually aren't
> COW so this case isn't meaningful for us.
>
> > On my Qube
> > the act of reading data from the IDE via PIO causes the data to be
> > placed in the D-cache (the RM52xx cache does write allocate), but the
> > page never gets flushed to physical memory and so suffers from cache
> > aliasing problems when it's mapped into user space.
> >
> > By enabling DMA on the IDE interface (it's off in the default Cobalt
> > config) the kernel suddenly becomes stable (the page in the page cache
> > never gets pulled into the D-cache).
> >
> > This seems to be a generic kernel problem - all architectures with VI
> > caches and write allocate policies could trigger it.
>
> Now that's where I'm getting some doubts about your explanation.  Assume
> we're paging in a page that isn't mapped yet:
>
> In this case do_no_page() will load the page.  Any DMA cache coherency
> issues are supposed to be handled by the driver.  That means for an
> executable page all that's missing is ensuring the I-cache is coherent.
> This is done in these two lines:
>
> [...]
>         flush_page_to_ram(new_page);
>         flush_icache_page(vma, new_page);
> [...]
>         update_mmu_cache(vma, address, entry);
> [...]
>
> flush_page_to_ram is (and must be!) a no-op.  So the burden is entirely
> up to flush_icache_page and update_mmu_cache.  Note flush_dcache_page
> never enters the picture when mapping an executable because the file has
> not been written to.  So let's see flush_icache_page:
>
> static void r4k_flush_icache_page(struct vm_area_struct *vma,
>         struct page *page)
> {
>         /*
>          * If there's no context yet, or the page isn't executable, no icache
>          * flush is needed.
>          */
>         if (!(vma->vm_flags & VM_EXEC))
>                 return;
>
> All this is only about I-cache coherence.  That is, we do nothing at all if
> this isn't an executable page.
>
>         /*
>          * Tricky ...  Because we don't know the virtual address we've got the
>          * choice of either invalidating the entire primary and secondary
>          * caches or invalidating the secondary caches also.  With the subset
>          * enforcement on R4000SC, R4400SC, R10000 and R12000 invalidating the
>          * secondary cache will result in any entries in the primary caches
>          * also getting invalidated which hopefully is a bit more economical.
>          */
>         if (cpu_has_subset_pcaches) {
>                 unsigned long addr = (unsigned long) page_address(page);
>                 r4k_blast_scache_page(addr);
>
>                 return;
>         }
>
> This section is only needed for certain processors such as the R4000SC.
> That is, it's not of interest here either.
>
>         if (!cpu_has_ic_fills_f_dc) {
>                 unsigned long addr = (unsigned long) page_address(page);
>                 r4k_blast_dcache_page(addr);
>         }
>
> But cpu_has_ic_fills_f_dc is always zero on Nevada.  Which means we're
> going to flush the page's kernel address from the D-cache here.
>
>         /*
>          * We're not sure of the virtual address(es) involved here, so
>          * we have to flush the entire I-cache.
>          */
>         if (cpu_has_vtag_icache) {
>                 int cpu = smp_processor_id();
>
>                 if (cpu_context(cpu, vma->vm_mm) != 0)
>                         drop_mmu_context(vma->vm_mm, cpu);
>
> ...
>
> cpu_has_vtag_icache is zero on Nevada so the else case will be taken:
>
>         } else
>                 r4k_blast_icache();
>
> so we just blast away the entire I-cache.  Coherency the hard way.  At
> this point we've established I-cache coherency for executable pages.
>
> But what if this was a non-executable page?  Then flush_icache_page would do
> nothing at all - nor would update_mmu_cache.  The page will be copied to
> userspace and ... whoops, data may still be in the wrong cache segment,
> game over.  This also explains a few other bugs.
>

I could see the aliases at the end of do_no_page() (using memcmp()) and
from the code knew they had to be read only, so I just assumed they were
executable pages.  I missed the fact that flush_icache_page() flushed the
D-cache page.  So, like you say, it must be non-executable read-only pages
that cause the problem.

> > So where's the correct place to put the flush_dcache_page() ? :-)
> >
> > I don't know whether the problem could affect any other IO subsystems
> > ... probably SCSI at least.
>
> As you describe it, it doesn't seem specific to any particular kind of
> device - only DMA or PIO matters; and the DMA coherency thing happens to
> paint over the issue, which must be why it wasn't discovered for so long.
>

So how do we fix it ?  Flushing the page really needs to be done just
after the IO into the page cache is complete, so we only do it once per
page cache page ?

P.
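
The failure described in this message, and the effect of writing the page
back once the IO has completed, can be illustrated with a toy model.  This
is only a sketch under strong assumptions: one word of RAM, one
write-allocate, write-back cache line per colour, and hypothetical toy_*
names that do not correspond to any kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCOLOURS 2u

struct toy_line {
        bool valid;
        bool dirty;
        uint32_t data;
};

struct toy {
        uint32_t ram;                   /* one word of physical memory */
        struct toy_line line[NCOLOURS]; /* one cache line per colour */
};

/* Store through a mapping of the given colour (write allocate). */
static inline void toy_store(struct toy *t, unsigned int colour, uint32_t v)
{
        assert(colour < NCOLOURS);
        t->line[colour] = (struct toy_line){ .valid = true, .dirty = true,
                                             .data = v };
}

/* Load through a mapping of the given colour, filling from RAM on a miss. */
static inline uint32_t toy_load(struct toy *t, unsigned int colour)
{
        assert(colour < NCOLOURS);
        struct toy_line *l = &t->line[colour];

        if (!l->valid)
                *l = (struct toy_line){ .valid = true, .dirty = false,
                                        .data = t->ram };
        return l->data;
}

/* Write a dirty line back to RAM, then drop it from the cache. */
static inline void toy_writeback_inv(struct toy *t, unsigned int colour)
{
        assert(colour < NCOLOURS);
        struct toy_line *l = &t->line[colour];

        if (l->valid && l->dirty)
                t->ram = l->data;
        l->valid = false;
        l->dirty = false;
}
```

In this model a PIO-style toy_store() through colour 1 followed by a
toy_load() through colour 0 returns the old RAM contents, because the new
data exists only in the colour-1 line.  Calling toy_writeback_inv() on
colour 1 before the first load through colour 0 makes the load return the
new data.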
* Re: Instability / caching problems on Qube 2 - solved ?
  2003-12-15  8:32   ` Peter Horton
@ 2003-12-15  9:04     ` Dominic Sweetman
  2003-12-15 19:51       ` Ralf Baechle
  0 siblings, 1 reply; 4+ messages in thread
From: Dominic Sweetman @ 2003-12-15  9:04 UTC
  To: Peter Horton; +Cc: Ralf Baechle, linux-mips

My prejudices are showing but...

o Shouldn't the kernel have a zero-tolerance policy towards cache
  aliases?  That is, no D-cache alias should ever be permitted to
  happen, not even in data you reasonably hope might be read-only?

  Aliases only appeared by a kind of mistake when the R4000 was
  opportunistically repackaged without the secondary cache (the L2
  cache tags used to keep track of the virtually-indexed L1s, and you
  got an exception if you created an L1-alias).

  They really aren't a feature to be tolerated in the hope you can
  clean up before disaster strikes.

o And I could never get my brains round cache maintenance if I used
  the same word ("flush") both for invalidate and write-back.

--
Dominic Sweetman
MIPS Technologies.
* Re: Instability / caching problems on Qube 2 - solved ?
  2003-12-15  9:04     ` Dominic Sweetman
@ 2003-12-15 19:51       ` Ralf Baechle
  0 siblings, 0 replies; 4+ messages in thread
From: Ralf Baechle @ 2003-12-15 19:51 UTC
  To: Dominic Sweetman; +Cc: Peter Horton, linux-mips

On Mon, Dec 15, 2003 at 09:04:49AM +0000, Dominic Sweetman wrote:

> My prejudices are showing but...
>
> o Shouldn't the kernel have a zero-tolerance policy towards cache
>   aliases?  That is, no D-cache alias should ever be permitted to
>   happen, not even in data you reasonably hope might be read-only?

We're already trying hard to avoid such aliases.  The case found by Peter
is clearly a bug and nothing else.

>   Aliases only appeared by a kind of mistake when the R4000 was
>   opportunistically repackaged without the secondary cache (the L2
>   cache tags used to keep track of the virtually-indexed L1s, and you
>   got an exception if you created an L1-alias).
>
>   They really aren't a feature to be tolerated in the hope you can
>   clean up before disaster strikes.

These days the R4000SC is only an ancient processor - but very valuable
for Linux maintenance because its virtual coherency exception is the only
available hardware detector for aliases.

> o And I could never get my brains round cache maintenance if I used
>   the same word ("flush") both for invalidate and write-back.

I once had a discussion about the terminology with maintainers of other
architectures.  It turned out none of the MIPS terms were really
unambiguous; does flush imply a writeback, does it imply invalidation?
Does invalidate imply writing back to memory, or writeback imply
invalidation, etc. etc. ad infinitum.  Confusion pure ...

  Ralf
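
One way to keep the terminology apart is to name the three possible
effects on a cache line explicitly and avoid the word "flush" altogether.
The sketch below is a toy model in C with hypothetical names (cache_op,
cache_line_op); it is not kernel code, though the three operations
correspond in spirit to the MIPS CACHE instruction's Hit_Writeback_D,
Hit_Invalidate_D and Hit_Writeback_Inv_D operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum cache_op {
        OP_WRITEBACK,            /* copy dirty data to RAM, keep the line */
        OP_INVALIDATE,           /* discard the line, dirty data is lost */
        OP_WRITEBACK_INVALIDATE  /* copy dirty data to RAM, then discard */
};

struct cline {
        bool valid;
        bool dirty;
        uint32_t data;
};

/* Apply one maintenance operation to a single modelled cache line. */
static inline void cache_line_op(struct cline *l, uint32_t *ram,
                                 enum cache_op op)
{
        assert(l != NULL && ram != NULL);

        /* Writeback part: only dirty, valid lines touch memory. */
        if (op != OP_INVALIDATE && l->valid && l->dirty) {
                *ram = l->data;
                l->dirty = false;
        }

        /* Invalidate part: the line no longer holds any data. */
        if (op != OP_WRITEBACK) {
                l->valid = false;
                l->dirty = false;
        }
}
```

Only OP_INVALIDATE can lose data, and only OP_WRITEBACK leaves the line
resident; a bare "flush" could mean any of the three.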