linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
@ 2010-04-28  7:32 Dima Zavin
  2010-04-28  7:35 ` Dima Zavin
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Dima Zavin @ 2010-04-28  7:32 UTC (permalink / raw)
  To: linux-arm-kernel

We can't be holding the mmap_sem while calling flush_cache_user_range
because the flush can fault. If we fault on a user address, the
page fault handler will try to take mmap_sem again. Since both places
acquire the read lock, most of the time it succeeds. However, if another
thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
between the call to flush_cache_user_range and the fault, the down_read
in do_page_fault will deadlock.

Also, since we really can't be holding the mmap_sem while calling
flush_cache_user_range AND vma is actually unused by the flush itself,
get rid of vma as an argument.

Signed-off-by: Dima Zavin <dima@android.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Arve Hj?nnev?g <arve@android.com>
---
 arch/arm/include/asm/cacheflush.h |    2 +-
 arch/arm/kernel/traps.c           |    4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 0d08d41..b68a2b4 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -336,7 +336,7 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
  * Harvard caches are synchronised for the user space address range.
  * This is used for the ARM private sys_cacheflush system call.
  */
-#define flush_cache_user_range(vma,start,end) \
+#define flush_cache_user_range(start,end) \
 	__cpuc_coherent_user_range((start) & PAGE_MASK, PAGE_ALIGN(end))
 
 /*
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 1621e53..2455fd3 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -453,7 +453,9 @@ do_cache_op(unsigned long start, unsigned long end, int flags)
 		if (end > vma->vm_end)
 			end = vma->vm_end;
 
-		flush_cache_user_range(vma, start, end);
+		up_read(&mm->mmap_sem);
+		flush_cache_user_range(start, end);
+		return;
 	}
 	up_read(&mm->mmap_sem);
 }
-- 
1.6.6

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-28  7:32 [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Dima Zavin
@ 2010-04-28  7:35 ` Dima Zavin
  2010-04-29 13:00 ` Russell King - ARM Linux
  2010-05-06 15:08 ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Catalin Marinas
  2 siblings, 0 replies; 14+ messages in thread
From: Dima Zavin @ 2010-04-28  7:35 UTC (permalink / raw)
  To: linux-arm-kernel

This patch is probably not as efficient as it could be since the flush
will end up faulting in pages the user never touched. I think we
should print a warning regarding that and skip over the unmapped
pages, I just wasn't sure how to do that correctly.

--Dima

2010/4/28 Dima Zavin <dima@android.com>:
> We can't be holding the mmap_sem while calling flush_cache_user_range
> because the flush can fault. If we fault on a user address, the
> page fault handler will try to take mmap_sem again. Since both places
> acquire the read lock, most of the time it succeeds. However, if another
> thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
> between the call to flush_cache_user_range and the fault, the down_read
> in do_page_fault will deadlock.
>
> Also, since we really can't be holding the mmap_sem while calling
> flush_cache_user_range AND vma is actually unused by the flush itself,
> get rid of vma as an argument.
>
> Signed-off-by: Dima Zavin <dima@android.com>
> Cc: Russell King <rmk+kernel@arm.linux.org.uk>
> Cc: Arve Hj?nnev?g <arve@android.com>
> ---
> ?arch/arm/include/asm/cacheflush.h | ? ?2 +-
> ?arch/arm/kernel/traps.c ? ? ? ? ? | ? ?4 +++-
> ?2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> index 0d08d41..b68a2b4 100644
> --- a/arch/arm/include/asm/cacheflush.h
> +++ b/arch/arm/include/asm/cacheflush.h
> @@ -336,7 +336,7 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
> ?* Harvard caches are synchronised for the user space address range.
> ?* This is used for the ARM private sys_cacheflush system call.
> ?*/
> -#define flush_cache_user_range(vma,start,end) \
> +#define flush_cache_user_range(start,end) \
> ? ? ? ?__cpuc_coherent_user_range((start) & PAGE_MASK, PAGE_ALIGN(end))
>
> ?/*
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index 1621e53..2455fd3 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -453,7 +453,9 @@ do_cache_op(unsigned long start, unsigned long end, int flags)
> ? ? ? ? ? ? ? ?if (end > vma->vm_end)
> ? ? ? ? ? ? ? ? ? ? ? ?end = vma->vm_end;
>
> - ? ? ? ? ? ? ? flush_cache_user_range(vma, start, end);
> + ? ? ? ? ? ? ? up_read(&mm->mmap_sem);
> + ? ? ? ? ? ? ? flush_cache_user_range(start, end);
> + ? ? ? ? ? ? ? return;
> ? ? ? ?}
> ? ? ? ?up_read(&mm->mmap_sem);
> ?}
> --
> 1.6.6
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-28  7:32 [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Dima Zavin
  2010-04-28  7:35 ` Dima Zavin
@ 2010-04-29 13:00 ` Russell King - ARM Linux
  2010-04-29 18:16   ` Jamie Lokier
  2010-05-06 15:08 ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Catalin Marinas
  2 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-04-29 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 28, 2010 at 12:32:11AM -0700, Dima Zavin wrote:
> We can't be holding the mmap_sem while calling flush_cache_user_range
> because the flush can fault. If we fault on a user address, the
> page fault handler will try to take mmap_sem again. Since both places
> acquire the read lock, most of the time it succeeds. However, if another
> thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
> between the call to flush_cache_user_range and the fault, the down_read
> in do_page_fault will deadlock.

That's a non-issue.  If do_cache_op is holding a read lock, _nothing_
_else_ can be holding that lock in write mode.  So, holding the lock in
read mode ensures that when faults occur, the fault handler can be sure
that its read lock will succeed.

Other threads trying to get a write lock will be prevented from doing
so until the entire cache flush operation has completed, which ensures
that the mapping can not be altered via munmap/mmap - and that is
completely desirable behaviour.

Please explain why you think this leads to a deadlock.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-29 13:00 ` Russell King - ARM Linux
@ 2010-04-29 18:16   ` Jamie Lokier
  2010-04-29 18:24     ` Russell King - ARM Linux
  0 siblings, 1 reply; 14+ messages in thread
From: Jamie Lokier @ 2010-04-29 18:16 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux wrote:
> On Wed, Apr 28, 2010 at 12:32:11AM -0700, Dima Zavin wrote:
> > We can't be holding the mmap_sem while calling flush_cache_user_range
> > because the flush can fault. If we fault on a user address, the
> > page fault handler will try to take mmap_sem again. Since both places
> > acquire the read lock, most of the time it succeeds. However, if another
> > thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
> > between the call to flush_cache_user_range and the fault, the down_read
> > in do_page_fault will deadlock.
> 
> That's a non-issue.  If do_cache_op is holding a read lock, _nothing_
> _else_ can be holding that lock in write mode.  So, holding the lock in
> read mode ensures that when faults occur, the fault handler can be sure
> that its read lock will succeed.

read-write locks will block a reader when there is a already blocked
writer.  Otherwise the writer can be permanently starved due to new
readers always arriving so reader count doesn't reach zero.

In other words, readers queue behind writers if there's a waiting writer.

-- Jamie

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-29 18:16   ` Jamie Lokier
@ 2010-04-29 18:24     ` Russell King - ARM Linux
  2010-04-29 19:23       ` Dima Zavin
  2010-05-06 15:00       ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld Catalin Marinas
  0 siblings, 2 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-04-29 18:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 29, 2010 at 07:16:36PM +0100, Jamie Lokier wrote:
> Russell King - ARM Linux wrote:
> > On Wed, Apr 28, 2010 at 12:32:11AM -0700, Dima Zavin wrote:
> > > We can't be holding the mmap_sem while calling flush_cache_user_range
> > > because the flush can fault. If we fault on a user address, the
> > > page fault handler will try to take mmap_sem again. Since both places
> > > acquire the read lock, most of the time it succeeds. However, if another
> > > thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
> > > between the call to flush_cache_user_range and the fault, the down_read
> > > in do_page_fault will deadlock.
> > 
> > That's a non-issue.  If do_cache_op is holding a read lock, _nothing_
> > _else_ can be holding that lock in write mode.  So, holding the lock in
> > read mode ensures that when faults occur, the fault handler can be sure
> > that its read lock will succeed.
> 
> read-write locks will block a reader when there is a already blocked
> writer.  Otherwise the writer can be permanently starved due to new
> readers always arriving so reader count doesn't reach zero.

Hmm, true, and rather unfortunate.

As I've already said, we can not do this cache maintainence outside of
the lock.

The point of this code is to first validate that the region we're working
on is valid.  As soon as we drop the lock, we lose the guarantee that
the region is valid while we operate on it - indeed, the region could be
unmapped and remapped by a different thread.

I think the only reasonable solution is to also walk the page tables
and do the cache handling on a per-page basis, which will make this
interface quite a bit slower - but that's the price we pay for
correctness.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-29 18:24     ` Russell King - ARM Linux
@ 2010-04-29 19:23       ` Dima Zavin
  2010-05-04  4:07         ` Dima Zavin
  2010-05-06 15:00       ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld Catalin Marinas
  1 sibling, 1 reply; 14+ messages in thread
From: Dima Zavin @ 2010-04-29 19:23 UTC (permalink / raw)
  To: linux-arm-kernel

Jamie/Russell,

Thanks for the feedback.

>> read-write locks will block a reader when there is a already blocked
>> writer. ?Otherwise the writer can be permanently starved due to new
>> readers always arriving so reader count doesn't reach zero.

That's precisely the problem.

> Hmm, true, and rather unfortunate.
>
> As I've already said, we can not do this cache maintainence outside of
> the lock.
>
> The point of this code is to first validate that the region we're working
> on is valid. ?As soon as we drop the lock, we lose the guarantee that
> the region is valid while we operate on it - indeed, the region could be
> unmapped and remapped by a different thread.

So what if it was remapped? The worst case scenario in this case is
that we needlessly flush a region of memory, but its not "wrong". It
can't be any worse than just doing a full cache flush. If another
thread unmapped the region, then we should (and will) segfault the
flushing thread, which is the correct behavior IMHO.

--Dima

> I think the only reasonable solution is to also walk the page tables
> and do the cache handling on a per-page basis, which will make this
> interface quite a bit slower - but that's the price we pay for
> correctness.


>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-29 19:23       ` Dima Zavin
@ 2010-05-04  4:07         ` Dima Zavin
  2010-05-04  7:40           ` Russell King - ARM Linux
  0 siblings, 1 reply; 14+ messages in thread
From: Dima Zavin @ 2010-05-04  4:07 UTC (permalink / raw)
  To: linux-arm-kernel

>> The point of this code is to first validate that the region we're working
>> on is valid. ?As soon as we drop the lock, we lose the guarantee that
>> the region is valid while we operate on it - indeed, the region could be
>> unmapped and remapped by a different thread.
>
> So what if it was remapped? The worst case scenario in this case is
> that we needlessly flush a region of memory, but its not "wrong". It
> can't be any worse than just doing a full cache flush. If another
> thread unmapped the region, then we should (and will) segfault the
> flushing thread, which is the correct behavior IMHO.

ping.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-05-04  4:07         ` Dima Zavin
@ 2010-05-04  7:40           ` Russell King - ARM Linux
  0 siblings, 0 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2010-05-04  7:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 03, 2010 at 09:07:03PM -0700, Dima Zavin wrote:
> >> The point of this code is to first validate that the region we're working
> >> on is valid. ?As soon as we drop the lock, we lose the guarantee that
> >> the region is valid while we operate on it - indeed, the region could be
> >> unmapped and remapped by a different thread.
> >
> > So what if it was remapped? The worst case scenario in this case is
> > that we needlessly flush a region of memory, but its not "wrong". It
> > can't be any worse than just doing a full cache flush. If another
> > thread unmapped the region, then we should (and will) segfault the
> > flushing thread, which is the correct behavior IMHO.
> 
> ping.

I still do not think this is a good idea for the reasons I've already
mentioned.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld
  2010-04-29 18:24     ` Russell King - ARM Linux
  2010-04-29 19:23       ` Dima Zavin
@ 2010-05-06 15:00       ` Catalin Marinas
  2010-05-06 16:01         ` Jamie Lokier
  1 sibling, 1 reply; 14+ messages in thread
From: Catalin Marinas @ 2010-05-06 15:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2010-04-29 at 19:24 +0100, Russell King - ARM Linux wrote:
> On Thu, Apr 29, 2010 at 07:16:36PM +0100, Jamie Lokier wrote:
> > Russell King - ARM Linux wrote:
> > > On Wed, Apr 28, 2010 at 12:32:11AM -0700, Dima Zavin wrote:
> > > > We can't be holding the mmap_sem while calling flush_cache_user_range
> > > > because the flush can fault. If we fault on a user address, the
> > > > page fault handler will try to take mmap_sem again. Since both places
> > > > acquire the read lock, most of the time it succeeds. However, if another
> > > > thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
> > > > between the call to flush_cache_user_range and the fault, the down_read
> > > > in do_page_fault will deadlock.
> > >
> > > That's a non-issue.  If do_cache_op is holding a read lock, _nothing_
> > > _else_ can be holding that lock in write mode.  So, holding the lock in
> > > read mode ensures that when faults occur, the fault handler can be sure
> > > that its read lock will succeed.
> >
> > read-write locks will block a reader when there is a already blocked
> > writer.  Otherwise the writer can be permanently starved due to new
> > readers always arriving so reader count doesn't reach zero.
> 
> Hmm, true, and rather unfortunate.
> 
> As I've already said, we can not do this cache maintainence outside of
> the lock.
> 
> The point of this code is to first validate that the region we're working
> on is valid.  As soon as we drop the lock, we lose the guarantee that
> the region is valid while we operate on it - indeed, the region could be
> unmapped and remapped by a different thread.

The flush_cache_user_range operation cannot actually damage the data. If
the application is so badly written that one of its threads remaps a
page range while another thread writes to it and flushes the caches,
then it deserves the memory corruption.

Personally, I would go even further and remove the find_vma() call (of
course with an access_ok() call to make sure the address isn't a kernel
one). I actually did some tests but the performance improvement was too
significant to be worth arguing the case on the list. But the app I was
using was a simple test where the vma tree was small. Things may be
different for a fully featured Java VM for example.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held
  2010-04-28  7:32 [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Dima Zavin
  2010-04-28  7:35 ` Dima Zavin
  2010-04-29 13:00 ` Russell King - ARM Linux
@ 2010-05-06 15:08 ` Catalin Marinas
  2 siblings, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2010-05-06 15:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2010-04-28 at 08:32 +0100, Dima Zavin wrote:
> We can't be holding the mmap_sem while calling flush_cache_user_range
> because the flush can fault. If we fault on a user address, the
> page fault handler will try to take mmap_sem again. Since both places
> acquire the read lock, most of the time it succeeds. However, if another
> thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in
> between the call to flush_cache_user_range and the fault, the down_read
> in do_page_fault will deadlock.
> 
> Also, since we really can't be holding the mmap_sem while calling
> flush_cache_user_range AND vma is actually unused by the flush itself,
> get rid of vma as an argument.
> 
> Signed-off-by: Dima Zavin <dima@android.com>
> Cc: Russell King <rmk+kernel@arm.linux.org.uk>
> Cc: Arve Hj?nnev?g <arve@android.com>

FWIW (and since I added the USER() macros to the *_coherent_user_range()
functions - commit 32cfb1b1), my view is that this is a safe patch (of
course, Russell has to agree with this as well):

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld
  2010-05-06 15:00       ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld Catalin Marinas
@ 2010-05-06 16:01         ` Jamie Lokier
  2010-05-06 16:07           ` Jamie Lokier
  2010-05-06 16:21           ` Catalin Marinas
  0 siblings, 2 replies; 14+ messages in thread
From: Jamie Lokier @ 2010-05-06 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

Catalin Marinas wrote:
> The flush_cache_user_range operation cannot actually damage the data. If
> the application is so badly written that one of its threads remaps a
> page range while another thread writes to it and flushes the caches,
> then it deserves the memory corruption.

It may deserve corruption, but doing corruption silently is cruel.

Moreover, calling mprotect(PROT_READ) in one thread while another
thread is writing to the same regions is a valid, and used, garbage
collector dirty-tracking technique.  (Page faults provide the info,
and the fault handler uses PROT_WRITE to let the faulting thread
continue on each tracked page.)

Is it possible to percolate EFAULT to the right places when the cache
flush faults?

> Personally, I would go even further and remove the find_vma() call (of
> course with an access_ok() call to make sure the address isn't a kernel
> one). I actually did some tests but the performance improvement was too
> significant to be worth arguing the case on the list. But the app I was
> using was a simple test where the vma tree was small. Things may be
> different for a fully featured Java VM for example.

Seems a reasonable thing to do, and a fully-featured use-lots-of-VMAs
app sounds like the sort of app which wants to flush part of caches
quickly.

-- Jamie

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld
  2010-05-06 16:01         ` Jamie Lokier
@ 2010-05-06 16:07           ` Jamie Lokier
  2010-05-06 16:24             ` Catalin Marinas
  2010-05-06 16:21           ` Catalin Marinas
  1 sibling, 1 reply; 14+ messages in thread
From: Jamie Lokier @ 2010-05-06 16:07 UTC (permalink / raw)
  To: linux-arm-kernel

Jamie Lokier wrote:
> Catalin Marinas wrote:
> > The flush_cache_user_range operation cannot actually damage the data. If
> > the application is so badly written that one of its threads remaps a
> > page range while another thread writes to it and flushes the caches,
> > then it deserves the memory corruption.
> 
> It may deserve corruption, but doing corruption silently is cruel.
> 
> Moreover, calling mprotect(PROT_READ) in one thread while another
> thread is writing to the same regions is a valid, and used, garbage
> collector dirty-tracking technique.  (Page faults provide the info,
> and the fault handler uses PROT_WRITE to let the faulting thread
> continue on each tracked page.)
> 
> Is it possible to percolate EFAULT to the right places when the cache
> flush faults?

Scratch that idea.  How about just doing a full mm
(address-independent) cache flush if a fault occurs?

Performance doesn't matter (it's not a normal situation), and it's
better than corruption (no week-long debugging session surprise).

-- Jamie

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld
  2010-05-06 16:01         ` Jamie Lokier
  2010-05-06 16:07           ` Jamie Lokier
@ 2010-05-06 16:21           ` Catalin Marinas
  1 sibling, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2010-05-06 16:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2010-05-06 at 17:01 +0100, Jamie Lokier wrote:
> Catalin Marinas wrote:
> > The flush_cache_user_range operation cannot actually damage the data. If
> > the application is so badly written that one of its threads remaps a
> > page range while another thread writes to it and flushes the caches,
> > then it deserves the memory corruption.
> 
> It may deserve corruption, but doing corruption silently is cruel.
> 
> Moreover, calling mprotect(PROT_READ) in one thread while another
> thread is writing to the same regions is a valid, and used, garbage
> collector dirty-tracking technique.  (Page faults provide the info,
> and the fault handler uses PROT_WRITE to let the faulting thread
> continue on each tracked page.)

For this kind of situation, mprotect() could make the page not
accessible for a brief period of time (but with the mm semaphore held).
A coherent_user_range call would fault and do_page_fault() handles it
correctly by restarting the cache line operation once the PTE is
reinstated. So I don't think the app would miss any cache flushing
operation.

> Is it possible to percolate EFAULT to the right places when the cache
> flush faults?

If the cache flush faults because the vma is no longer there (though a
few cycles before there was one), the kernel calls __do_kernel_fault()
which tries to fix up the exception by calling the 9001 label in
cache-v6/v7.S. This label just tries the following page ignoring the
cache flushing operation.

That's a place where we can return -EFAULT from
v6/v7_coherent_user_range and change do_cache_op() to make use of this
value.

But that's a special case when one thread of an application unmaps an
address range while another thread writes to it.

> > Personally, I would go even further and remove the find_vma() call (of
> > course with an access_ok() call to make sure the address isn't a kernel
> > one). I actually did some tests but the performance improvement was too

BTW, I meant "was not too" above.

> > significant to be worth arguing the case on the list. But the app I was
> > using was a simple test where the vma tree was small. Things may be
> > different for a fully featured Java VM for example.
> 
> Seems a reasonable thing to do, and a fully-featured use-lots-of-VMAs
> app sounds like the sort of app which wants to flush part of caches
> quickly.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld
  2010-05-06 16:07           ` Jamie Lokier
@ 2010-05-06 16:24             ` Catalin Marinas
  0 siblings, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2010-05-06 16:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2010-05-06 at 17:07 +0100, Jamie Lokier wrote:
> Jamie Lokier wrote:
> > Catalin Marinas wrote:
> > > The flush_cache_user_range operation cannot actually damage the data. If
> > > the application is so badly written that one of its threads remaps a
> > > page range while another thread writes to it and flushes the caches,
> > > then it deserves the memory corruption.
> >
> > It may deserve corruption, but doing corruption silently is cruel.
> >
> > Moreover, calling mprotect(PROT_READ) in one thread while another
> > thread is writing to the same regions is a valid, and used, garbage
> > collector dirty-tracking technique.  (Page faults provide the info,
> > and the fault handler uses PROT_WRITE to let the faulting thread
> > continue on each tracked page.)
> >
> > Is it possible to percolate EFAULT to the right places when the cache
> > flush faults?
> 
> Scratch that idea.  How about just doing a full mm
> (address-independent) cache flush if a fault occurs?
> 
> Performance doesn't matter (it's not a normal situation), and it's
> better than corruption (no week-long debugging session surprise).

I already replied on this specific case. I don't think we get any
corruption with the current implementation (and coherent_user_range
called outside the mm semaphore).

What if we had user-accessible cache flushing operations (well, a subset
of them)? We wouldn't have been able to take any semaphore and I'm sure
JIT people would have made use of them.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-05-06 16:24 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-28  7:32 [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Dima Zavin
2010-04-28  7:35 ` Dima Zavin
2010-04-29 13:00 ` Russell King - ARM Linux
2010-04-29 18:16   ` Jamie Lokier
2010-04-29 18:24     ` Russell King - ARM Linux
2010-04-29 19:23       ` Dima Zavin
2010-05-04  4:07         ` Dima Zavin
2010-05-04  7:40           ` Russell King - ARM Linux
2010-05-06 15:00       ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_semheld Catalin Marinas
2010-05-06 16:01         ` Jamie Lokier
2010-05-06 16:07           ` Jamie Lokier
2010-05-06 16:24             ` Catalin Marinas
2010-05-06 16:21           ` Catalin Marinas
2010-05-06 15:08 ` [PATCH] [ARM] Do not call flush_cache_user_range with mmap_sem held Catalin Marinas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).