linuxppc-dev.lists.ozlabs.org archive mirror
* Unsafe pte_update() in do_page_fault() (4xx and Book-E)
@ 2006-03-02 20:26 Eugene Surovegin
  2006-03-03  3:43 ` Benjamin Herrenschmidt
  2006-03-28  7:55 ` [PATCH] lock PTE before updating it in 440/BookE page fault handler Eugene Surovegin
  0 siblings, 2 replies; 6+ messages in thread
From: Eugene Surovegin @ 2006-03-02 20:26 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, Kumar Gala

Hi!

For the last couple of days I have been debugging rare

  swap_dup: Bad swap file entry 0x00000080

errors in my custom 2.4 kernel running on a 405GPr system.

My current theory is that this error is caused by the special lazy 
dcache/icache flush handling on 4xx and BookE. Because this code in my 
2.4 was actually a backport from 2.6, I think we have a problem in 
current 2.6 as well.

Here is what I think happens. On 4xx/BookE we use the execute bit to
defer the dcache-to-icache flush: when the execute trap triggers,
do_page_fault() flushes the page and sets the _PAGE_HWEXEC bit in the
PTE.

Unfortunately, we don't lock this PTE, so it's possible that after the
pte_present() check but _before_ the pte_update() call this particular
page is evicted from memory, e.g. because of extreme memory
pressure (of course, I'm assuming preemption is enabled).

If this happens, pte_update() sets the _PAGE_HWEXEC bit in the
just-cleared PTE. Some time later, another page fault happens for this
page, but because of that set bit the pte_none() test in
handle_pte_fault() fails, and we continue along the wrong path,
treating the PTE as an entry for a page that was swapped out to the
swap file. This triggers the swap_dup error I mentioned at the
beginning.

_PAGE_HWEXEC is 0x200 on 405GPr, and because the swap entry is the PTE
value shifted 2 bits to the right, we get that "0x00000080" value.

Paul, does my theory make any sense? I cannot test 2.6 on our
hardware. So far, after I added additional page_table_lock locking to
do_page_fault() in my 2.4, I haven't seen these errors, but it's too
early to be 100% sure :).

I'll make a patch for 2.6 if you think my analysis is correct.

-- 
Eugene


Thread overview: 6+ messages
2006-03-02 20:26 Unsafe pte_update() in do_page_fault() (4xx and Book-E) Eugene Surovegin
2006-03-03  3:43 ` Benjamin Herrenschmidt
2006-03-03  3:55   ` Eugene Surovegin
2006-03-28  7:55 ` [PATCH] lock PTE before updating it in 440/BookE page fault handler Eugene Surovegin
2006-03-28 11:10   ` Paul Mackerras
2006-03-28 18:13     ` Eugene Surovegin
