* [PATCH 1/3] ARM: cacheflush: don't round address range up to nearest page
2013-03-25 18:18 [PATCH 0/3] Optimise cache-flushing system call and add iovec variant Will Deacon
@ 2013-03-25 18:18 ` Will Deacon
2013-03-27 11:05 ` Catalin Marinas
2013-03-25 18:18 ` [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma Will Deacon
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2013-03-25 18:18 UTC (permalink / raw)
To: linux-arm-kernel
The flush_cache_user_range macro takes a pair of addresses describing
the start and end of the virtual address range to flush. Due to an
accidental oversight when flush_cache_user_range was introduced, the
address range was rounded up so that the start and end addresses were
page-aligned.
For historical reference, the interesting commits in history.git are:
10eacf1775e1 ("[ARM] Clean up ARM cache handling interfaces (part 1)")
71432e79b76b ("[ARM] Add flush_cache_user_page() for sys_cacheflush()")
This patch removes the alignment code, reducing the amount of flushing
required for ranges that are not an exact multiple of PAGE_SIZE.
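As an illustration of the saving (the page size and addresses here are examples, not taken from the patch), the old macro widened every request out to page boundaries:

```c
#include <assert.h>

#define PAGE_SIZE_EX 4096UL
#define PAGE_MASK_EX (~(PAGE_SIZE_EX - 1))
#define PAGE_ALIGN_EX(x) (((x) + PAGE_SIZE_EX - 1) & PAGE_MASK_EX)

/* bytes flushed by the old, page-rounding macro */
static unsigned long old_flush_span(unsigned long start, unsigned long end)
{
	return PAGE_ALIGN_EX(end) - (start & PAGE_MASK_EX);
}

/* bytes flushed after this patch: exactly what was requested */
static unsigned long new_flush_span(unsigned long start, unsigned long end)
{
	return end - start;
}
```

For a 64-byte range in the middle of a page, the old behaviour flushed the whole page.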
Reported-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/include/asm/cacheflush.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e1489c5..2965ad1 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -268,8 +268,7 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
* Harvard caches are synchronised for the user space address range.
* This is used for the ARM private sys_cacheflush system call.
*/
-#define flush_cache_user_range(start,end) \
- __cpuc_coherent_user_range((start) & PAGE_MASK, PAGE_ALIGN(end))
+#define flush_cache_user_range(s,e) __cpuc_coherent_user_range(s,e)
/*
* Perform necessary cache operations to ensure that data previously
--
1.8.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 1/3] ARM: cacheflush: don't round address range up to nearest page
2013-03-25 18:18 ` [PATCH 1/3] ARM: cacheflush: don't round address range up to nearest page Will Deacon
@ 2013-03-27 11:05 ` Catalin Marinas
0 siblings, 0 replies; 13+ messages in thread
From: Catalin Marinas @ 2013-03-27 11:05 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Mar 25, 2013 at 06:18:04PM +0000, Will Deacon wrote:
> The flush_cache_user_range macro takes a pair of addresses describing
> the start and end of the virtual address range to flush. Due to an
> accidental oversight when flush_cache_user_range was introduced, the
> address range was rounded up so that the start and end addresses were
> page-aligned.
I don't think it was an accidental oversight; it was actually
covering another issue, where writing an instruction caused a full-page
CoW. We've since fixed that issue, so the patch is ok.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma
2013-03-25 18:18 [PATCH 0/3] Optimise cache-flushing system call and add iovec variant Will Deacon
2013-03-25 18:18 ` [PATCH 1/3] ARM: cacheflush: don't round address range up to nearest page Will Deacon
@ 2013-03-25 18:18 ` Will Deacon
2013-03-27 11:09 ` Catalin Marinas
2013-03-25 18:18 ` [PATCH 3/3] ARM: cacheflush: add new iovec-based cache flushing system call Will Deacon
2013-03-25 18:44 ` [PATCH 0/3] Optimise cache-flushing system call and add iovec variant Jonathan Austin
3 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2013-03-25 18:18 UTC (permalink / raw)
To: linux-arm-kernel
do_cache_op finds the lowest VMA contained in the specified address
range and rounds the range to cover only the mapped addresses.
Since commit 4542b6a0fa6b ("ARM: 7365/1: drop unused parameter from
flush_cache_user_range") the VMA is not used for anything else in this
code and seeing as the low-level cache flushing routines return -EFAULT
if the address is not valid, there is no need for this range truncation.
This patch removes the VMA handling code from the cacheflushing syscall.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/kernel/traps.c | 17 +----------------
1 file changed, 1 insertion(+), 16 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 1c08911..da5e268 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
static inline int
do_cache_op(unsigned long start, unsigned long end, int flags)
{
- struct mm_struct *mm = current->active_mm;
- struct vm_area_struct *vma;
-
if (end < start || flags)
return -EINVAL;
- down_read(&mm->mmap_sem);
- vma = find_vma(mm, start);
- if (vma && vma->vm_start < end) {
- if (start < vma->vm_start)
- start = vma->vm_start;
- if (end > vma->vm_end)
- end = vma->vm_end;
-
- up_read(&mm->mmap_sem);
- return flush_cache_user_range(start, end);
- }
- up_read(&mm->mmap_sem);
- return -EINVAL;
+ return flush_cache_user_range(start, end);
}
/*
--
1.8.0
* [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma
2013-03-25 18:18 ` [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma Will Deacon
@ 2013-03-27 11:09 ` Catalin Marinas
2013-03-27 12:15 ` Will Deacon
0 siblings, 1 reply; 13+ messages in thread
From: Catalin Marinas @ 2013-03-27 11:09 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Mar 25, 2013 at 06:18:05PM +0000, Will Deacon wrote:
> do_cache_op finds the lowest VMA contained in the specified address
> range and rounds the range to cover only the mapped addresses.
>
> Since commit 4542b6a0fa6b ("ARM: 7365/1: drop unused parameter from
> flush_cache_user_range") the VMA is not used for anything else in this
> code and seeing as the low-level cache flushing routines return -EFAULT
> if the address is not valid, there is no need for this range truncation.
>
> This patch removes the VMA handling code from the cacheflushing syscall.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
> arch/arm/kernel/traps.c | 17 +----------------
> 1 file changed, 1 insertion(+), 16 deletions(-)
>
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index 1c08911..da5e268 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
> static inline int
> do_cache_op(unsigned long start, unsigned long end, int flags)
> {
> - struct mm_struct *mm = current->active_mm;
> - struct vm_area_struct *vma;
> -
> if (end < start || flags)
> return -EINVAL;
>
> - down_read(&mm->mmap_sem);
> - vma = find_vma(mm, start);
> - if (vma && vma->vm_start < end) {
> - if (start < vma->vm_start)
> - start = vma->vm_start;
> - if (end > vma->vm_end)
> - end = vma->vm_end;
> -
> - up_read(&mm->mmap_sem);
> - return flush_cache_user_range(start, end);
> - }
> - up_read(&mm->mmap_sem);
> - return -EINVAL;
> + return flush_cache_user_range(start, end);
While this would work, it introduces the possibility of a DoS where an
application passes a bigger valid range (kernel linear mapping) and the
kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
that's why Russell rejected such a patch a while back.
--
Catalin
* [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma
2013-03-27 11:09 ` Catalin Marinas
@ 2013-03-27 12:15 ` Will Deacon
2013-03-27 12:21 ` Catalin Marinas
0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2013-03-27 12:15 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> On Mon, Mar 25, 2013 at 06:18:05PM +0000, Will Deacon wrote:
> > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > index 1c08911..da5e268 100644
> > --- a/arch/arm/kernel/traps.c
> > +++ b/arch/arm/kernel/traps.c
> > @@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
> > static inline int
> > do_cache_op(unsigned long start, unsigned long end, int flags)
> > {
> > - struct mm_struct *mm = current->active_mm;
> > - struct vm_area_struct *vma;
> > -
> > if (end < start || flags)
> > return -EINVAL;
> >
> > - down_read(&mm->mmap_sem);
> > - vma = find_vma(mm, start);
> > - if (vma && vma->vm_start < end) {
> > - if (start < vma->vm_start)
> > - start = vma->vm_start;
> > - if (end > vma->vm_end)
> > - end = vma->vm_end;
> > -
> > - up_read(&mm->mmap_sem);
> > - return flush_cache_user_range(start, end);
> > - }
> > - up_read(&mm->mmap_sem);
> > - return -EINVAL;
> > + return flush_cache_user_range(start, end);
>
> While this would work, it introduces the possibility of a DoS where an
> application passes a bigger valid range (kernel linear mapping) and the
> kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> that's why Russell rejected such a patch a while back.
Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a kernel
linear mapping address -- we'll fault straight away because it's not a
userspace address. Secondly, what's to stop an application from mmaping a
large area into a single VMA and giving rise to the same situation? Finally,
interrupts are enabled during this operation, so I don't understand how you
can trigger a DoS, irrespective of the preempt configuration.
Is there an old thread I can refer to with more details about this? It may
be that some of the assumptions there no longer hold with subsequent changes
to the fault handling on this path.
Will
* [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma
2013-03-27 12:15 ` Will Deacon
@ 2013-03-27 12:21 ` Catalin Marinas
2013-03-27 12:43 ` Will Deacon
0 siblings, 1 reply; 13+ messages in thread
From: Catalin Marinas @ 2013-03-27 12:21 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 27, 2013 at 12:15:12PM +0000, Will Deacon wrote:
> On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> > On Mon, Mar 25, 2013 at 06:18:05PM +0000, Will Deacon wrote:
> > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> > > index 1c08911..da5e268 100644
> > > --- a/arch/arm/kernel/traps.c
> > > +++ b/arch/arm/kernel/traps.c
> > > @@ -509,25 +509,10 @@ static int bad_syscall(int n, struct pt_regs *regs)
> > > static inline int
> > > do_cache_op(unsigned long start, unsigned long end, int flags)
> > > {
> > > - struct mm_struct *mm = current->active_mm;
> > > - struct vm_area_struct *vma;
> > > -
> > > if (end < start || flags)
> > > return -EINVAL;
> > >
> > > - down_read(&mm->mmap_sem);
> > > - vma = find_vma(mm, start);
> > > - if (vma && vma->vm_start < end) {
> > > - if (start < vma->vm_start)
> > > - start = vma->vm_start;
> > > - if (end > vma->vm_end)
> > > - end = vma->vm_end;
> > > -
> > > - up_read(&mm->mmap_sem);
> > > - return flush_cache_user_range(start, end);
> > > - }
> > > - up_read(&mm->mmap_sem);
> > > - return -EINVAL;
> > > + return flush_cache_user_range(start, end);
> >
> > While this would work, it introduces the possibility of a DoS where an
> > application passes a bigger valid range (kernel linear mapping) and the
> > kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> > that's why Russell rejected such a patch a while back.
>
> Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a kernel
> linear mapping address -- we'll fault straight away because it's not a
> userspace address.
Fault where?
> Secondly, what's to stop an application from mmaping a large area into
> a single VMA and giving rise to the same situation? Finally,
> interrupts are enabled during this operation, so I don't understand
> how you can trigger a DoS, irrespective of the preempt configuration.
You can prevent context switching to other threads. But I agree, with a
large vma (which is already faulted in), you can get similar behaviour.
> Is there an old thread I can refer to with more details about this? It may
> be that some of the assumptions there no longer hold with subsequent changes
> to the fault handling on this path.
You could search the list for "do_cache_op", I don't have any at hand.
--
Catalin
* [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma
2013-03-27 12:21 ` Catalin Marinas
@ 2013-03-27 12:43 ` Will Deacon
2013-03-27 13:08 ` Catalin Marinas
0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2013-03-27 12:43 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 27, 2013 at 12:21:59PM +0000, Catalin Marinas wrote:
> On Wed, Mar 27, 2013 at 12:15:12PM +0000, Will Deacon wrote:
> > On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> > > While this would work, it introduces the possibility of a DoS where an
> > > application passes a bigger valid range (kernel linear mapping) and the
> > > kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> > > that's why Russell rejected such a patch a while back.
> >
> > Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a kernel
> > linear mapping address -- we'll fault straight away because it's not a
> > userspace address.
>
> Fault where?
I was expecting something like an access_ok check, but you're right, we
don't have one (and I guess that's not strictly needed given that flushing
isn't destructive). I still find it a bit scary that we allow userspace to
pass kernel addresses through though -- especially if there's something like
a DMA or CPU suspend operation running on another core.
> > Secondly, what's to stop an application from mmaping a large area into
> > a single VMA and giving rise to the same situation? Finally,
> > interrupts are enabled during this operation, so I don't understand
> > how you can trigger a DoS, irrespective of the preempt configuration.
>
> You can prevent context switching to other threads. But I agree, with a
> large vma (which is already faulted in), you can get similar behaviour.
So the easy fix is to split the range up into chunks and call cond_resched
after processing each one.
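A userspace sketch of that chunking idea (the chunk size, stub names and counters are illustrative; the real code would call flush_cache_user_range() and cond_resched()):

```c
#include <assert.h>

#define CHUNK_SIZE (1UL << 20) /* hypothetical 1 MiB chunk */

static unsigned long flushed;       /* bytes "flushed" (test instrumentation) */
static unsigned long resched_calls; /* times we offered to reschedule */

/* stand-in for flush_cache_user_range(): just count the bytes */
static int flush_range(unsigned long start, unsigned long end)
{
	flushed += end - start;
	return 0;
}

/* stand-in for cond_resched(): just count the calls */
static void cond_resched_stub(void)
{
	resched_calls++;
}

/* flush [start, end) one chunk at a time, yielding between chunks */
static int do_cache_op_chunked(unsigned long start, unsigned long end)
{
	while (start < end) {
		unsigned long next = start + CHUNK_SIZE;

		if (next > end || next < start) /* clamp, and guard wrap-around */
			next = end;

		int ret = flush_range(start, next);
		if (ret)
			return ret;

		cond_resched_stub();
		start = next;
	}
	return 0;
}
```

This bounds the time between scheduling points regardless of how large a range userspace passes in.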
Will
* [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma
2013-03-27 12:43 ` Will Deacon
@ 2013-03-27 13:08 ` Catalin Marinas
0 siblings, 0 replies; 13+ messages in thread
From: Catalin Marinas @ 2013-03-27 13:08 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 27, 2013 at 12:43:52PM +0000, Will Deacon wrote:
> On Wed, Mar 27, 2013 at 12:21:59PM +0000, Catalin Marinas wrote:
> > On Wed, Mar 27, 2013 at 12:15:12PM +0000, Will Deacon wrote:
> > > On Wed, Mar 27, 2013 at 11:09:38AM +0000, Catalin Marinas wrote:
> > > > While this would work, it introduces the possibility of a DoS where an
> > > > application passes a bigger valid range (kernel linear mapping) and the
> > > > kernel code would not be preempted (CONFIG_PREEMPT disabled). IIRC,
> > > > that's why Russell rejected such a patch a while back.
> > >
> > > Hmm, I'm not sure I buy that argument. Firstly, you can't just pass a kernel
> > > linear mapping address -- we'll fault straight away because it's not a
> > > userspace address.
> >
> > Fault where?
>
> I was expecting something like an access_ok check, but you're right, we
> don't have one (and I guess that's not strictly needed given that flushing
> isn't destructive). I still find it a bit scary that we allow userspace to
> pass kernel addresses through though -- especially if there's something like
> a DMA or CPU suspend operation running on another core.
Currently we don't allow kernel addresses since we require a valid vma.
If we drop the vma search, we should add an access_ok.
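A minimal sketch of such a range check (the user/kernel split value below is illustrative; the kernel's access_ok() uses TASK_SIZE and its own fault machinery):

```c
#include <assert.h>

/* illustrative 3G/1G user/kernel split; stand-in for the kernel's TASK_SIZE */
#define TASK_SIZE_STUB 0xc0000000UL

/* reject ranges that wrap or reach into kernel space, as access_ok() would */
static int user_range_ok(unsigned long start, unsigned long end)
{
	return end >= start && end <= TASK_SIZE_STUB;
}
```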
> > > Secondly, what's to stop an application from mmaping a large area into
> > > a single VMA and giving rise to the same situation? Finally,
> > > interrupts are enabled during this operation, so I don't understand
> > > how you can trigger a DoS, irrespective of the preempt configuration.
> >
> > You can prevent context switching to other threads. But I agree, with a
> > large vma (which is already faulted in), you can get similar behaviour.
>
> So the easy fix is to split the range up into chunks and call cond_resched
> after processing each one.
This would work.
--
Catalin
* [PATCH 3/3] ARM: cacheflush: add new iovec-based cache flushing system call
2013-03-25 18:18 [PATCH 0/3] Optimise cache-flushing system call and add iovec variant Will Deacon
2013-03-25 18:18 ` [PATCH 1/3] ARM: cacheflush: don't round address range up to nearest page Will Deacon
2013-03-25 18:18 ` [PATCH 2/3] ARM: cacheflush: don't bother rounding to nearest vma Will Deacon
@ 2013-03-25 18:18 ` Will Deacon
2013-03-27 11:12 ` Catalin Marinas
2013-03-25 18:44 ` [PATCH 0/3] Optimise cache-flushing system call and add iovec variant Jonathan Austin
3 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2013-03-25 18:18 UTC (permalink / raw)
To: linux-arm-kernel
The cacheflush system call flushes a contiguous range, specified by a
pair of addresses. This means that user applications wishing to flush
discrete ranges must issue multiple system calls, since attempting to
flush past a vma results in range truncation.
This patch introduces a new private system call for ARM, cacheflush_iov,
which processes a range of addresses described in an iovec structure.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/include/uapi/asm/unistd.h | 1 +
arch/arm/kernel/traps.c | 36 ++++++++++++++++++++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/arch/arm/include/uapi/asm/unistd.h b/arch/arm/include/uapi/asm/unistd.h
index af33b44..bcad38c 100644
--- a/arch/arm/include/uapi/asm/unistd.h
+++ b/arch/arm/include/uapi/asm/unistd.h
@@ -421,6 +421,7 @@
#define __ARM_NR_usr26 (__ARM_NR_BASE+3)
#define __ARM_NR_usr32 (__ARM_NR_BASE+4)
#define __ARM_NR_set_tls (__ARM_NR_BASE+5)
+#define __ARM_NR_cacheflush_iov (__ARM_NR_BASE+6)
/*
* *NOTE*: This is a ghost syscall private to the kernel. Only the
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index da5e268..f83eed6 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -25,6 +25,7 @@
#include <linux/delay.h>
#include <linux/init.h>
#include <linux/sched.h>
+#include <linux/slab.h>
#include <linux/atomic.h>
#include <asm/cacheflush.h>
@@ -515,6 +516,37 @@ do_cache_op(unsigned long start, unsigned long end, int flags)
return flush_cache_user_range(start, end);
}
+static inline int
+do_cache_op_iov(const struct iovec __user *uiov, unsigned long cnt, int flags)
+{
+ int i, ret = 0;
+ unsigned long len = cnt * sizeof(struct iovec);
+ struct iovec *iov = kmalloc(len, GFP_KERNEL);
+
+ if (iov == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (copy_from_user(iov, uiov, len)) {
+ ret = -EFAULT;
+ goto out_free;
+ }
+
+ for (i = 0; i < cnt; ++i) {
+ unsigned long start = (unsigned long __force)iov[i].iov_base;
+ unsigned long end = start + iov[i].iov_len;
+ ret = do_cache_op(start, end, flags);
+ if (ret)
+ break;
+ }
+
+out_free:
+ kfree(iov);
+out:
+ return ret;
+}
+
/*
* Handle all unrecognised system calls.
* 0x9f0000 - 0x9fffff are some more esoteric system calls
@@ -560,6 +592,10 @@ asmlinkage int arm_syscall(int no, struct pt_regs *regs)
case NR(cacheflush):
return do_cache_op(regs->ARM_r0, regs->ARM_r1, regs->ARM_r2);
+ case NR(cacheflush_iov):
+ return do_cache_op_iov((const struct iovec __user *)regs->ARM_r0,
+ regs->ARM_r1, regs->ARM_r2);
+
case NR(usr26):
if (!(elf_hwcap & HWCAP_26BIT))
break;
--
1.8.0
* [PATCH 3/3] ARM: cacheflush: add new iovec-based cache flushing system call
2013-03-25 18:18 ` [PATCH 3/3] ARM: cacheflush: add new iovec-based cache flushing system call Will Deacon
@ 2013-03-27 11:12 ` Catalin Marinas
2013-05-23 10:52 ` Will Deacon
0 siblings, 1 reply; 13+ messages in thread
From: Catalin Marinas @ 2013-03-27 11:12 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Mar 25, 2013 at 06:18:06PM +0000, Will Deacon wrote:
> The cacheflush system call flushes a contiguous range, specified by a
> pair of addresses. This means that user applications wishing to flush
> discrete ranges must issue multiple system calls, since attempting to
> flush past a vma results in range truncation.
>
> This patch introduces a new private system call for ARM, cacheflush_iov,
> which processes a range of addresses described in an iovec structure.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
> arch/arm/include/uapi/asm/unistd.h | 1 +
> arch/arm/kernel/traps.c | 36 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/arch/arm/include/uapi/asm/unistd.h b/arch/arm/include/uapi/asm/unistd.h
> index af33b44..bcad38c 100644
> --- a/arch/arm/include/uapi/asm/unistd.h
> +++ b/arch/arm/include/uapi/asm/unistd.h
> @@ -421,6 +421,7 @@
> #define __ARM_NR_usr26 (__ARM_NR_BASE+3)
> #define __ARM_NR_usr32 (__ARM_NR_BASE+4)
> #define __ARM_NR_set_tls (__ARM_NR_BASE+5)
> +#define __ARM_NR_cacheflush_iov (__ARM_NR_BASE+6)
>
> /*
> * *NOTE*: This is a ghost syscall private to the kernel. Only the
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index da5e268..f83eed6 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -25,6 +25,7 @@
> #include <linux/delay.h>
> #include <linux/init.h>
> #include <linux/sched.h>
> +#include <linux/slab.h>
>
> #include <linux/atomic.h>
> #include <asm/cacheflush.h>
> @@ -515,6 +516,37 @@ do_cache_op(unsigned long start, unsigned long end, int flags)
> return flush_cache_user_range(start, end);
> }
>
> +static inline int
> +do_cache_op_iov(const struct iovec __user *uiov, unsigned long cnt, int flags)
> +{
> + int i, ret = 0;
> + unsigned long len = cnt * sizeof(struct iovec);
> + struct iovec *iov = kmalloc(len, GFP_KERNEL);
> +
> + if (iov == NULL) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + if (copy_from_user(iov, uiov, len)) {
> + ret = -EFAULT;
> + goto out_free;
> + }
> +
> + for (i = 0; i < cnt; ++i) {
> + unsigned long start = (unsigned long __force)iov[i].iov_base;
> + unsigned long end = start + iov[i].iov_len;
> + ret = do_cache_op(start, end, flags);
> + if (ret)
> + break;
> + }
Could you use get_user() on struct iovec fields directly to avoid
kmalloc? It would be slightly faster.
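A plain-C sketch of that suggestion (the get_user() stand-in and the byte counter are illustrative; real kernel code would use get_user() with its fault handling and __user annotations):

```c
#include <assert.h>
#include <stddef.h>

struct iovec_s {        /* mirrors struct iovec's layout */
	void *iov_base;
	size_t iov_len;
};

static unsigned long flushed_bytes; /* test instrumentation */

/* stand-in for flush_cache_user_range() */
static int flush_user_range(unsigned long start, unsigned long end)
{
	flushed_bytes += end - start;
	return 0;
}

/* per-field get_user() stand-in: fetches one word of the "user" iovec */
static int get_user_word(unsigned long *dst, const void *src)
{
	*dst = *(const unsigned long *)src;
	return 0;
}

/* walk the iovec entry by entry, with no kmalloc'd bounce buffer */
static int cache_op_iov_nocopy(const struct iovec_s *uiov, unsigned long cnt)
{
	for (unsigned long i = 0; i < cnt; i++) {
		unsigned long base, len;

		if (get_user_word(&base, &uiov[i].iov_base) ||
		    get_user_word(&len, &uiov[i].iov_len))
			return -14; /* -EFAULT */

		int ret = flush_user_range(base, base + len);
		if (ret)
			return ret;
	}
	return 0;
}
```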
--
Catalin
* [PATCH 3/3] ARM: cacheflush: add new iovec-based cache flushing system call
2013-03-27 11:12 ` Catalin Marinas
@ 2013-05-23 10:52 ` Will Deacon
0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2013-05-23 10:52 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 27, 2013 at 11:12:11AM +0000, Catalin Marinas wrote:
> On Mon, Mar 25, 2013 at 06:18:06PM +0000, Will Deacon wrote:
> > +static inline int
> > +do_cache_op_iov(const struct iovec __user *uiov, unsigned long cnt, int flags)
> > +{
> > + int i, ret = 0;
> > + unsigned long len = cnt * sizeof(struct iovec);
> > + struct iovec *iov = kmalloc(len, GFP_KERNEL);
> > +
> > + if (iov == NULL) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
> > +
> > + if (copy_from_user(iov, uiov, len)) {
> > + ret = -EFAULT;
> > + goto out_free;
> > + }
> > +
> > + for (i = 0; i < cnt; ++i) {
> > + unsigned long start = (unsigned long __force)iov[i].iov_base;
> > + unsigned long end = start + iov[i].iov_len;
> > + ret = do_cache_op(start, end, flags);
> > + if (ret)
> > + break;
> > + }
>
> Could you use get_user() on struct iovec fields directly to avoid
> kmalloc? It would be slightly faster.
I just had a look at this, but the problem is that we want to know the
entire (combined) region size so that we may just decide to flush the entire
cache, rather than iterate over each region line-by-line. This requires
copying the whole iovec into kernel space up front, so that we can iterate
over the regions with iov_length.
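The trade-off described above can be sketched like this (the threshold and helper names are invented for illustration; a real implementation would compare against the actual cache size):

```c
#include <assert.h>
#include <stddef.h>

struct iovec_s {        /* mirrors struct iovec's layout */
	void *iov_base;
	size_t iov_len;
};

/* hypothetical cut-over point, standing in for the CPU's cache size */
#define WHOLE_CACHE_THRESHOLD (1UL << 20)

/* sum the ranges up front -- why the whole iovec is copied in first */
static size_t combined_len(const struct iovec_s *iov, unsigned long cnt)
{
	size_t total = 0;

	for (unsigned long i = 0; i < cnt; i++)
		total += iov[i].iov_len;
	return total;
}

/* 1 if a single full-cache flush beats walking each range line-by-line */
static int should_flush_whole_cache(const struct iovec_s *iov, unsigned long cnt)
{
	return combined_len(iov, cnt) >= WHOLE_CACHE_THRESHOLD;
}
```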
Will
* [PATCH 0/3] Optimise cache-flushing system call and add iovec variant
2013-03-25 18:18 [PATCH 0/3] Optimise cache-flushing system call and add iovec variant Will Deacon
` (2 preceding siblings ...)
2013-03-25 18:18 ` [PATCH 3/3] ARM: cacheflush: add new iovec-based cache flushing system call Will Deacon
@ 2013-03-25 18:44 ` Jonathan Austin
3 siblings, 0 replies; 13+ messages in thread
From: Jonathan Austin @ 2013-03-25 18:44 UTC (permalink / raw)
To: linux-arm-kernel
Hi Will,
On 25/03/13 18:18, Will Deacon wrote:
> Hello,
>
> This patch series has been sitting on my box for some time, pending
> benchmark results and analysis.
>
> The results are that, with the first two patches in this series, we see
> ~2% improvement in browser benchmark scores, which flush the cache fairly
> regularly in response to jitting JS code.
>
> The final patch is more of an RFC, since I don't have any software that
> uses it outside of my toy programs.
>
> All feedback welcome,
>
For all three patches:
Tested-by: Jonathan Austin <Jonathan.Austin@arm.com>
The last one was tested with a modified jit that was hacked to use the
new API as a direct replacement for the old one (i.e. no cleverness with
delayed flushing/grouping of calls). The performance decrease from using
the new call instead of the old one was well within the noise of the
tests. We are, however, yet to test the improvement from actually
*exploiting* the new mechanism. If there's a positive response to the
idea then I can encourage the jit guys to have another look.
Jonny
> Will
>
>
> Will Deacon (3):
> ARM: cacheflush: don't round address range up to nearest page
> ARM: cacheflush: don't bother rounding to nearest vma
> ARM: cacheflush: add new iovec-based cache flushing system call
>
> arch/arm/include/asm/cacheflush.h | 3 +--
> arch/arm/include/uapi/asm/unistd.h | 1 +
> arch/arm/kernel/traps.c | 49 +++++++++++++++++++++++++++-----------
> 3 files changed, 37 insertions(+), 16 deletions(-)
>