Linux MIPS Architecture development
 help / color / mirror / Atom feed
* [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency
@ 2013-11-07 18:35 Maciej W. Rozycki
  2013-11-07 18:35 ` Maciej W. Rozycki
  2014-03-21 10:07 ` Ralf Baechle
  0 siblings, 2 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2013-11-07 18:35 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips

We currently support no MIPS processor that has its I-cache coherent with 
the D-cache, no such processor may even exist.  We apparently have two 
configurations that have fully coherent D-caches and therefore set 
cpu_has_ic_fills_f_dc, and these are the Alchemy and NetLogic processor 
families.  I have checked relevant CPU documentation I was able to track 
down and in both cases the respective documents[1][2] clearly state that 
the I-cache provides no hardware coherency and whenever instructions in 
memory are modified then the I-cache has to be synchronized by software 
even though the D-caches are fully coherent.

Therefore we cannot ever avoid the call to flush_cache_page in 
copy_to_user_page and here is a change that reflects this observation.  
The implementation of flush_cache_page may then choose freely whether it 
needs to perform a full cache synchronization with D-cache writeback and 
invalidation requests or whether a lone I-cache invalidation will suffice.  
The c-r4k.c implementation already respects the setting of 
cpu_has_ic_fills_f_dc and avoids touching the D-cache unless necessary.

The lack of I-cache synchronization is typically seen in debugging 
sessions e.g. with GDB where software breakpoints are used.  When such a 
breakpoint is hit and subsequently replaced using a ptrace(2) call with 
the original instruction, the BREAK instruction previously executed 
sometimes remains in the I-cache and causes the breakpoint just removed to 
hit again regardless, resulting in a spurious SIGTRAP signal delivery that 
debuggers typically complain about (e.g. "Program received signal SIGTRAP, 
Trace/breakpoint trap" in the case of GDB).  Of course the I-cache line 
containing the BREAK instruction may have since been randomly replaced, in 
which case no problem occurs.

[1] "AMD Alchemy Au1200 Processor Data Book", AMD Alchemy, January, 2005, 
    Publication ID: 32798A

[2] "XLP Processor Family Programming Reference Manual", NetLogic 
    Microsystems, Revision Level 1.10, February, 2011, Document Number 
    10724V110PM-CR (regrettably not publicly available)

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
---
Ralf,

 Please apply.  I've seen these SIGTRAPs in some NetLogic GDB testing and 
the removal of this cpu_has_ic_fills_f_dc condition from copy_to_user_page 
is really necessary; also the Au1200 document is very explicit about the 
requirement of I-cache invalidation in software (see Section 2.3.7.3 
"Instruction Cache Coherency").

  Maciej

linux-mips-exec-cache-sync.diff
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 3e0eb5f..1251d86 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -239,7 +239,7 @@ void copy_to_user_page(struct vm_area_struct *vma,
 		if (cpu_has_dc_aliases)
 			SetPageDcacheDirty(page);
 	}
-	if ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc)
+	if (vma->vm_flags & VM_EXEC)
 		flush_cache_page(vma, vaddr, page_to_pfn(page));
 }
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency
  2013-11-07 18:35 [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency Maciej W. Rozycki
@ 2013-11-07 18:35 ` Maciej W. Rozycki
  2014-03-21 10:07 ` Ralf Baechle
  1 sibling, 0 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2013-11-07 18:35 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips

We currently support no MIPS processor that has its I-cache coherent with 
the D-cache, no such processor may even exist.  We apparently have two 
configurations that have fully coherent D-caches and therefore set 
cpu_has_ic_fills_f_dc, and these are the Alchemy and NetLogic processor 
families.  I have checked relevant CPU documentation I was able to track 
down and in both cases the respective documents[1][2] clearly state that 
the I-cache provides no hardware coherency and whenever instructions in 
memory are modified then the I-cache has to be synchronized by software 
even though the D-caches are fully coherent.

Therefore we cannot ever avoid the call to flush_cache_page in 
copy_to_user_page and here is a change that reflects this observation.  
The implementation of flush_cache_page may then choose freely whether it 
needs to perform a full cache synchronization with D-cache writeback and 
invalidation requests or whether a lone I-cache invalidation will suffice.  
The c-r4k.c implementation already respects the setting of 
cpu_has_ic_fills_f_dc and avoids touching the D-cache unless necessary.

The lack of I-cache synchronization is typically seen in debugging 
sessions e.g. with GDB where software breakpoints are used.  When such a 
breakpoint is hit and subsequently replaced using a ptrace(2) call with 
the original instruction, the BREAK instruction previously executed 
sometimes remains in the I-cache and causes the breakpoint just removed to 
hit again regardless, resulting in a spurious SIGTRAP signal delivery that 
debuggers typically complain about (e.g. "Program received signal SIGTRAP, 
Trace/breakpoint trap" in the case of GDB).  Of course the I-cache line 
containing the BREAK instruction may have since been randomly replaced, in 
which case no problem occurs.

[1] "AMD Alchemy Au1200 Processor Data Book", AMD Alchemy, January, 2005, 
    Publication ID: 32798A

[2] "XLP Processor Family Programming Reference Manual", NetLogic 
    Microsystems, Revision Level 1.10, February, 2011, Document Number 
    10724V110PM-CR (regrettably not publicly available)

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
---
Ralf,

 Please apply.  I've seen these SIGTRAPs in some NetLogic GDB testing and 
the removal of this cpu_has_ic_fills_f_dc condition from copy_to_user_page 
is really necessary; also the Au1200 document is very explicit about the 
requirement of I-cache invalidation in software (see Section 2.3.7.3 
"Instruction Cache Coherency").

  Maciej

linux-mips-exec-cache-sync.diff
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 3e0eb5f..1251d86 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -239,7 +239,7 @@ void copy_to_user_page(struct vm_area_struct *vma,
 		if (cpu_has_dc_aliases)
 			SetPageDcacheDirty(page);
 	}
-	if ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc)
+	if (vma->vm_flags & VM_EXEC)
 		flush_cache_page(vma, vaddr, page_to_pfn(page));
 }
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency
  2013-11-07 18:35 [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency Maciej W. Rozycki
  2013-11-07 18:35 ` Maciej W. Rozycki
@ 2014-03-21 10:07 ` Ralf Baechle
  2014-03-21 16:56   ` David Daney
  2014-11-13 21:50   ` Maciej W. Rozycki
  1 sibling, 2 replies; 6+ messages in thread
From: Ralf Baechle @ 2014-03-21 10:07 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-mips

On Thu, Nov 07, 2013 at 06:35:54PM +0000, Maciej W. Rozycki wrote:

> We currently support no MIPS processor that has its I-cache coherent with 
> the D-cache, no such processor may even exist.  We apparently have two 
> configurations that have fully coherent D-caches and therefore set 
> cpu_has_ic_fills_f_dc, and these are the Alchemy and NetLogic processor 
> families.  I have checked relevant CPU documentation I was able to track 
> down and in both cases the respective documents[1][2] clearly state that 
> the I-cache provides no hardware coherency and whenever instructions in 
> memory are modified then the I-cache has to be synchronized by software 
> even though the D-caches are fully coherent.

No, cpu_has_ic_fills_f_dc doesn't mean the D-cache is coherent but rather
that the I-cache is refilled from the D-cache if there was a hit.  This
means there is no need to write back the D-cache to S-cache or even memory
which is saving some time.

> Therefore we cannot ever avoid the call to flush_cache_page in 
> copy_to_user_page and here is a change that reflects this observation.  
> The implementation of flush_cache_page may then choose freely whether it 
> needs to perform a full cache synchronization with D-cache writeback and 
> invalidation requests or whether a lone I-cache invalidation will suffice.  
> The c-r4k.c implementation already respects the setting of 
> cpu_has_ic_fills_f_dc and avoids touching the D-cache unless necessary.
> 
> The lack of I-cache synchronization is typically seen in debugging 
> sessions e.g. with GDB where software breakpoints are used.  When such a 
> breakpoint is hit and subsequently replaced using a ptrace(2) call with 
> the original instruction, the BREAK instruction previously executed 
> sometimes remains in the I-cache and causes the breakpoint just removed to 
> hit again regardless, resulting in a spurious SIGTRAP signal delivery that 
> debuggers typically complain about (e.g. "Program received signal SIGTRAP, 
> Trace/breakpoint trap" in the case of GDB).  Of course the I-cache line 
> containing the BREAK instruction may have since been randomly replaced, in 
> which case no problem occurs.
> 
> [1] "AMD Alchemy Au1200 Processor Data Book", AMD Alchemy, January, 2005, 
>     Publication ID: 32798A
> 
> [2] "XLP Processor Family Programming Reference Manual", NetLogic 
>     Microsystems, Revision Level 1.10, February, 2011, Document Number 
>     10724V110PM-CR (regrettably not publicly available)
> 
> Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
> ---
> Ralf,
> 
>  Please apply.  I've seen these SIGTRAPs in some NetLogic GDB testing and 
> the removal of this cpu_has_ic_fills_f_dc condition from copy_to_user_page 
> is really necessary; also the Au1200 document is very explicit about the 
> requirement of I-cache invalidation in software (see Section 2.3.7.3 
> "Instruction Cache Coherency").

You found a bug and yes, the fix you sent improves things a bit.  But
there is also the cache on a cache coherent system where a page might
be marked for a delayed cache flush with SetPageDcacheDirty(), then
flushed by flush_cache_page() before eventually the delayed cacheflush
flushes it once more for a good meassure.

What do you think about below patch to also deal with the duplicate flushing?

  Ralf

 arch/mips/mm/init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 6b59617..80072ef 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -227,13 +227,13 @@ void copy_to_user_page(struct vm_area_struct *vma,
 		void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
 		memcpy(vto, src, len);
 		kunmap_coherent();
+		if (vma->vm_flags & VM_EXEC)
+			flush_cache_page(vma, vaddr, page_to_pfn(page));
 	} else {
 		memcpy(dst, src, len);
 		if (cpu_has_dc_aliases)
 			SetPageDcacheDirty(page);
 	}
-	if ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc)
-		flush_cache_page(vma, vaddr, page_to_pfn(page));
 }
 
 void copy_from_user_page(struct vm_area_struct *vma,

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency
  2014-03-21 10:07 ` Ralf Baechle
@ 2014-03-21 16:56   ` David Daney
  2014-11-13 21:50   ` Maciej W. Rozycki
  1 sibling, 0 replies; 6+ messages in thread
From: David Daney @ 2014-03-21 16:56 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Maciej W. Rozycki, linux-mips

On 03/21/2014 03:07 AM, Ralf Baechle wrote:
> On Thu, Nov 07, 2013 at 06:35:54PM +0000, Maciej W. Rozycki wrote:
>
>> We currently support no MIPS processor that has its I-cache coherent with
>> the D-cache, no such processor may even exist.  We apparently have two
>> configurations that have fully coherent D-caches and therefore set
>> cpu_has_ic_fills_f_dc, and these are the Alchemy and NetLogic processor
>> families.  I have checked relevant CPU documentation I was able to track
>> down and in both cases the respective documents[1][2] clearly state that
>> the I-cache provides no hardware coherency and whenever instructions in
>> memory are modified then the I-cache has to be synchronized by software
>> even though the D-caches are fully coherent.
>
> No, cpu_has_ic_fills_f_dc doesn't mean the D-cache is coherent but rather
> that the I-cache is refilled from the D-cache if there was a hit.  This
> means there is no need to write back the D-cache to S-cache or even memory
> which is saving some time.
>
>> Therefore we cannot ever avoid the call to flush_cache_page in
>> copy_to_user_page and here is a change that reflects this observation.
>> The implementation of flush_cache_page may then choose freely whether it
>> needs to perform a full cache synchronization with D-cache writeback and
>> invalidation requests or whether a lone I-cache invalidation will suffice.
>> The c-r4k.c implementation already respects the setting of
>> cpu_has_ic_fills_f_dc and avoids touching the D-cache unless necessary.
>>
>> The lack of I-cache synchronization is typically seen in debugging
>> sessions e.g. with GDB where software breakpoints are used.  When such a
>> breakpoint is hit and subsequently replaced using a ptrace(2) call with
>> the original instruction, the BREAK instruction previously executed
>> sometimes remains in the I-cache and causes the breakpoint just removed to
>> hit again regardless, resulting in a spurious SIGTRAP signal delivery that
>> debuggers typically complain about (e.g. "Program received signal SIGTRAP,
>> Trace/breakpoint trap" in the case of GDB).  Of course the I-cache line
>> containing the BREAK instruction may have since been randomly replaced, in
>> which case no problem occurs.
>>
>> [1] "AMD Alchemy Au1200 Processor Data Book", AMD Alchemy, January, 2005,
>>      Publication ID: 32798A
>>
>> [2] "XLP Processor Family Programming Reference Manual", NetLogic
>>      Microsystems, Revision Level 1.10, February, 2011, Document Number
>>      10724V110PM-CR (regrettably not publicly available)
>>
>> Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
>> ---
>> Ralf,
>>
>>   Please apply.  I've seen these SIGTRAPs in some NetLogic GDB testing and
>> the removal of this cpu_has_ic_fills_f_dc condition from copy_to_user_page
>> is really necessary; also the Au1200 document is very explicit about the
>> requirement of I-cache invalidation in software (see Section 2.3.7.3
>> "Instruction Cache Coherency").
>
> You found a bug and yes, the fix you sent improves things a bit.  But
> there is also the cache on a cache coherent system where a page might
> be marked for a delayed cache flush with SetPageDcacheDirty(), then
> flushed by flush_cache_page() before eventually the delayed cacheflush
> flushes it once more for a good meassure.
>
> What do you think about below patch to also deal with the duplicate flushing?
>


The problem only happens when modifying target executable code through 
the ptrace() system call.

For all cases where a program is modifying its own executable memory, we 
require that it make the special mips cacheflush system call.

I don't object to modifying this file, but I wonder if the call to the 
flushing function should be pushed up into the ptrace() code.

David Daney


>    Ralf
>
>   arch/mips/mm/init.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
> index 6b59617..80072ef 100644
> --- a/arch/mips/mm/init.c
> +++ b/arch/mips/mm/init.c
> @@ -227,13 +227,13 @@ void copy_to_user_page(struct vm_area_struct *vma,
>   		void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
>   		memcpy(vto, src, len);
>   		kunmap_coherent();
> +		if (vma->vm_flags & VM_EXEC)
> +			flush_cache_page(vma, vaddr, page_to_pfn(page));
>   	} else {
>   		memcpy(dst, src, len);
>   		if (cpu_has_dc_aliases)
>   			SetPageDcacheDirty(page);
>   	}
> -	if ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc)
> -		flush_cache_page(vma, vaddr, page_to_pfn(page));
>   }
>
>   void copy_from_user_page(struct vm_area_struct *vma,
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency
  2014-03-21 10:07 ` Ralf Baechle
  2014-03-21 16:56   ` David Daney
@ 2014-11-13 21:50   ` Maciej W. Rozycki
  2014-11-13 21:50     ` Maciej W. Rozycki
  1 sibling, 1 reply; 6+ messages in thread
From: Maciej W. Rozycki @ 2014-11-13 21:50 UTC (permalink / raw)
  To: Ralf Baechle, David Daney; +Cc: linux-mips

On Fri, 21 Mar 2014, Ralf Baechle wrote:

> >  Please apply.  I've seen these SIGTRAPs in some NetLogic GDB testing and 
> > the removal of this cpu_has_ic_fills_f_dc condition from copy_to_user_page 
> > is really necessary; also the Au1200 document is very explicit about the 
> > requirement of I-cache invalidation in software (see Section 2.3.7.3 
> > "Instruction Cache Coherency").
> 
> You found a bug and yes, the fix you sent improves things a bit.  But
> there is also the cache on a cache coherent system where a page might
> be marked for a delayed cache flush with SetPageDcacheDirty(), then
> flushed by flush_cache_page() before eventually the delayed cacheflush
> flushes it once more for a good meassure.
> 
> What do you think about below patch to also deal with the duplicate flushing?
> 
>   Ralf
> 
>  arch/mips/mm/init.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
> index 6b59617..80072ef 100644
> --- a/arch/mips/mm/init.c
> +++ b/arch/mips/mm/init.c
> @@ -227,13 +227,13 @@ void copy_to_user_page(struct vm_area_struct *vma,
>  		void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
>  		memcpy(vto, src, len);
>  		kunmap_coherent();
> +		if (vma->vm_flags & VM_EXEC)
> +			flush_cache_page(vma, vaddr, page_to_pfn(page));
>  	} else {
>  		memcpy(dst, src, len);
>  		if (cpu_has_dc_aliases)
>  			SetPageDcacheDirty(page);
>  	}
> -	if ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc)
> -		flush_cache_page(vma, vaddr, page_to_pfn(page));
>  }
>  
>  void copy_from_user_page(struct vm_area_struct *vma,

 This clearly won't work, because `cpu_has_dc_aliases' is unset for 
NetLogic processors and therefore the second leg of the conditional you 
propose to patch is taken, whereas your change applies to the first leg.

 So if you want to tackle the case of SetPageDcacheDirty, then here it 
has to be something along the lines of:

{
	int delayed_cache_flush = 0

        if (cpu_has_dc_aliases &&
	    page_mapped(page) && !Page_dcache_dirty(page)) {
		void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
		memcpy(vto, src, len);
		kunmap_coherent();
	} else {
		memcpy(dst, src, len);
		if (cpu_has_dc_aliases) {
			delayed_cache_flush = 1;
			SetPageDcacheDirty(page);
		}
	}
	if (!delayed_cache_flush && (vma->vm_flags & VM_EXEC))
		flush_cache_page(vma, vaddr, page_to_pfn(page));
}

and then all the places where the delayed flush is made (a couple, 
according to Page_dcache_dirty() references) updated accordingly to 
synchronise the I-cache too.

 Although it seems to me we may have no easy way to access VM flags 
there, so perhaps another page flag to mark the required I-cache flush 
too?  Like:

	if ((vma->vm_flags & VM_EXEC)) {
		if (delayed_cache_flush)
			SetPageIcacheStale(page);
		else
			flush_cache_page(vma, vaddr, page_to_pfn(page));
	}

?  WDYT?

On Fri, 21 Mar 2014, David Daney wrote:

> The problem only happens when modifying target executable code through the
> ptrace() system call.
> 
> For all cases where a program is modifying its own executable memory, 
> we require that it make the special mips cacheflush system call.
> 
> I don't object to modifying this file, but I wonder if the call to the 
> flushing function should be pushed up into the ptrace() code.

 Hmm, good point, although it looks to me a lot of __access_remote_vm() 
code would have to be carried over or maybe factored out somehow from 
mm/memory.c to arch/mips/kernel/ptrace.c.  So although it sounds to me 
like a reasonable idea overall, it also appears to me it would best be 
done as a separate project rather than a part of a fix for this specific 
bug.

 Especially as overall what we do here is an extreme overkill.  Call to 
ptrace(PTRACE_POKETEXT, ...) are typically made to patch up single 
instructions so at most two cache lines need to be sychronised whereas 
we use a sledgehammer and kill a whole page worth of cache data -- 
always painful and even more painful if you go up to 16kB let alone 64kB 
with the page size.  Then from the MIPS32r2 ISA onwards we have the 
SYNCI instruction that could be used instead that would save even more.

  Maciej

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency
  2014-11-13 21:50   ` Maciej W. Rozycki
@ 2014-11-13 21:50     ` Maciej W. Rozycki
  0 siblings, 0 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2014-11-13 21:50 UTC (permalink / raw)
  To: Ralf Baechle, David Daney; +Cc: linux-mips

On Fri, 21 Mar 2014, Ralf Baechle wrote:

> >  Please apply.  I've seen these SIGTRAPs in some NetLogic GDB testing and 
> > the removal of this cpu_has_ic_fills_f_dc condition from copy_to_user_page 
> > is really necessary; also the Au1200 document is very explicit about the 
> > requirement of I-cache invalidation in software (see Section 2.3.7.3 
> > "Instruction Cache Coherency").
> 
> You found a bug and yes, the fix you sent improves things a bit.  But
> there is also the cache on a cache coherent system where a page might
> be marked for a delayed cache flush with SetPageDcacheDirty(), then
> flushed by flush_cache_page() before eventually the delayed cacheflush
> flushes it once more for a good meassure.
> 
> What do you think about below patch to also deal with the duplicate flushing?
> 
>   Ralf
> 
>  arch/mips/mm/init.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
> index 6b59617..80072ef 100644
> --- a/arch/mips/mm/init.c
> +++ b/arch/mips/mm/init.c
> @@ -227,13 +227,13 @@ void copy_to_user_page(struct vm_area_struct *vma,
>  		void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
>  		memcpy(vto, src, len);
>  		kunmap_coherent();
> +		if (vma->vm_flags & VM_EXEC)
> +			flush_cache_page(vma, vaddr, page_to_pfn(page));
>  	} else {
>  		memcpy(dst, src, len);
>  		if (cpu_has_dc_aliases)
>  			SetPageDcacheDirty(page);
>  	}
> -	if ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc)
> -		flush_cache_page(vma, vaddr, page_to_pfn(page));
>  }
>  
>  void copy_from_user_page(struct vm_area_struct *vma,

 This clearly won't work, because `cpu_has_dc_aliases' is unset for 
NetLogic processors and therefore the second leg of the conditional you 
propose to patch is taken, whereas your change applies to the first leg.

 So if you want to tackle the case of SetPageDcacheDirty, then here it 
has to be something along the lines of:

{
	int delayed_cache_flush = 0

        if (cpu_has_dc_aliases &&
	    page_mapped(page) && !Page_dcache_dirty(page)) {
		void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
		memcpy(vto, src, len);
		kunmap_coherent();
	} else {
		memcpy(dst, src, len);
		if (cpu_has_dc_aliases) {
			delayed_cache_flush = 1;
			SetPageDcacheDirty(page);
		}
	}
	if (!delayed_cache_flush && (vma->vm_flags & VM_EXEC))
		flush_cache_page(vma, vaddr, page_to_pfn(page));
}

and then all the places where the delayed flush is made (a couple, 
according to Page_dcache_dirty() references) updated accordingly to 
synchronise the I-cache too.

 Although it seems to me we may have no easy way to access VM flags 
there, so perhaps another page flag to mark the required I-cache flush 
too?  Like:

	if ((vma->vm_flags & VM_EXEC)) {
		if (delayed_cache_flush)
			SetPageIcacheStale(page);
		else
			flush_cache_page(vma, vaddr, page_to_pfn(page));
	}

?  WDYT?

On Fri, 21 Mar 2014, David Daney wrote:

> The problem only happens when modifying target executable code through the
> ptrace() system call.
> 
> For all cases where a program is modifying its own executable memory, 
> we require that it make the special mips cacheflush system call.
> 
> I don't object to modifying this file, but I wonder if the call to the 
> flushing function should be pushed up into the ptrace() code.

 Hmm, good point, although it looks to me a lot of __access_remote_vm() 
code would have to be carried over or maybe factored out somehow from 
mm/memory.c to arch/mips/kernel/ptrace.c.  So although it sounds to me 
like a reasonable idea overall, it also appears to me it would best be 
done as a separate project rather than a part of a fix for this specific 
bug.

 Especially as overall what we do here is an extreme overkill.  Call to 
ptrace(PTRACE_POKETEXT, ...) are typically made to patch up single 
instructions so at most two cache lines need to be sychronised whereas 
we use a sledgehammer and kill a whole page worth of cache data -- 
always painful and even more painful if you go up to 16kB let alone 64kB 
with the page size.  Then from the MIPS32r2 ISA onwards we have the 
SYNCI instruction that could be used instead that would save even more.

  Maciej

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-11-13 21:51 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-07 18:35 [PATCH] MIPS: copy_to_user_page: Avoid ptrace(2) I-cache incoherency Maciej W. Rozycki
2013-11-07 18:35 ` Maciej W. Rozycki
2014-03-21 10:07 ` Ralf Baechle
2014-03-21 16:56   ` David Daney
2014-11-13 21:50   ` Maciej W. Rozycki
2014-11-13 21:50     ` Maciej W. Rozycki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox