potential deadlock in r4k_flush_cache

Linux MIPS Architecture development
 help / color / mirror / Atom feed

* potential deadlock in r4k_flush_cache_sigtramp()
@ 2017-09-23 13:40 Al Viro
  2017-09-26 13:59 ` James Hogan
  0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2017-09-23 13:40 UTC (permalink / raw)
  To: linux-mips; +Cc: James Hogan, Ralf Baechle

	Calling get_user_pages_fast() while holding ->mmap_sem is
asking for trouble:

CPU1: r4k_flush_cache_sigtramp()
	down_read(&current->mm->mmap_sem);

CPU2: (running thread with the same ->mm): sys_pkey_alloc()
	down_write(&current->mm->mmap_sem);

CPU1:
	pages = get_user_pages_fast(addr, 1, 0, &args.page);
which hits an absent page and goto slow.  Then it goes on to
        ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT,
                                      pages, write ? FOLL_WRITE : 0);
which does
        return __get_user_pages_unlocked(current, current->mm, start, nr_pages,
                                         pages, gup_flags | FOLL_TOUCH);
which does
        down_read(&mm->mmap_sem);
        ret = __get_user_pages_locked(tsk, mm, start, nr_pages, pages, NULL, 
                                      &locked, false, gup_flags);

and we have a classical deadlock on recursive down_read() (thread 1: down_read()
gets the rwsem; thread 2: down_write() blocks waiting for thread 1 to release
it; thread 1: down_read() blocks waiting for thread 2 to get through down_write()
and eventual up_write(), which completes the deadlock).

Replacing pkey_alloc(2) with e.g. mmap(2) turns that from hard deadlock into
something killable, but while "with bad timing you might get process stuck
hard" is worse than "with bad timing you might get process stuck until you
kill -9 it", neither is a good thing.

I'm not familiar enough with arch/mips guts to suggest any variant of solution;
replacing get_user_pages_fast() with get_user_pages_locked() would solve the
deadlock, but that loses the fast path; not taking ->mmap_sem there have
local_r4k_flush_cache_sigtramp() run without fcs_args->mm being locked, which
might or might not be a problem.  Suggestions?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: potential deadlock in r4k_flush_cache_sigtramp()
  2017-09-23 13:40 potential deadlock in r4k_flush_cache_sigtramp() Al Viro
@ 2017-09-26 13:59 ` James Hogan
  2017-09-26 13:59   ` James Hogan
  0 siblings, 1 reply; 3+ messages in thread
From: James Hogan @ 2017-09-26 13:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-mips, Ralf Baechle

[-- Attachment #1: Type: text/plain, Size: 3100 bytes --]

Hi Al,

On Sat, Sep 23, 2017 at 02:40:21PM +0100, Al Viro wrote:
> 	Calling get_user_pages_fast() while holding ->mmap_sem is
> asking for trouble:
> 
> CPU1: r4k_flush_cache_sigtramp()
> 	down_read(&current->mm->mmap_sem);
> 
> CPU2: (running thread with the same ->mm): sys_pkey_alloc()
> 	down_write(&current->mm->mmap_sem);
> 
> CPU1:
> 	pages = get_user_pages_fast(addr, 1, 0, &args.page);
> which hits an absent page and goto slow.  Then it goes on to
>         ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT,
>                                       pages, write ? FOLL_WRITE : 0);
> which does
>         return __get_user_pages_unlocked(current, current->mm, start, nr_pages,
>                                          pages, gup_flags | FOLL_TOUCH);
> which does
>         down_read(&mm->mmap_sem);
>         ret = __get_user_pages_locked(tsk, mm, start, nr_pages, pages, NULL, 
>                                       &locked, false, gup_flags);
> 
> and we have a classical deadlock on recursive down_read() (thread 1: down_read()
> gets the rwsem; thread 2: down_write() blocks waiting for thread 1 to release
> it; thread 1: down_read() blocks waiting for thread 2 to get through down_write()
> and eventual up_write(), which completes the deadlock).

Hmm, indeed. Thanks for spotting that one.

> Replacing pkey_alloc(2) with e.g. mmap(2) turns that from hard deadlock into
> something killable, but while "with bad timing you might get process stuck
> hard" is worse than "with bad timing you might get process stuck until you
> kill -9 it", neither is a good thing.
> 
> I'm not familiar enough with arch/mips guts to suggest any variant of solution;
> replacing get_user_pages_fast() with get_user_pages_locked() would solve the
> deadlock, but that loses the fast path; not taking ->mmap_sem there have
> local_r4k_flush_cache_sigtramp() run without fcs_args->mm being locked, which
> might or might not be a problem.  Suggestions?

I suspect my concern was the theoretical situation where:

task1 hits an FPU branch instruction that needs emulating
	this creates a trampoline above the stack in user memory to
	execute the delay slot instruction, and calls
	r4k_flush_cache_sigtramp()

task2 in same process changes the memory mapping somehow.

SMP call to local_r4k_flush_cache_sigtramp()
	it may do cache op using the user virtual address when same mm
	active, or using kmap otherwise.

For the kmap case (SMP call to CPU with different process executing) its
holding a reference to the page so it should be fine. If the page was
swapped out or otherwise copied and it icache flushed the old page, then
I think the new page would get flushed when swapped back in so thats
fine.

Otherwise I can't imagine a situation where it'd misbehave unless
userland explicitly remapped the memory and it accessed using the
virtual address, in which case its asking for trouble.

Therefore I suspect it should be safe to drop the mmap_sem, but thats
OTOH. Have I missed anything important?

Thanks
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: potential deadlock in r4k_flush_cache_sigtramp()
  2017-09-26 13:59 ` James Hogan
@ 2017-09-26 13:59   ` James Hogan
  0 siblings, 0 replies; 3+ messages in thread
From: James Hogan @ 2017-09-26 13:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-mips, Ralf Baechle

[-- Attachment #1: Type: text/plain, Size: 3100 bytes --]

Hi Al,

On Sat, Sep 23, 2017 at 02:40:21PM +0100, Al Viro wrote:
> 	Calling get_user_pages_fast() while holding ->mmap_sem is
> asking for trouble:
> 
> CPU1: r4k_flush_cache_sigtramp()
> 	down_read(&current->mm->mmap_sem);
> 
> CPU2: (running thread with the same ->mm): sys_pkey_alloc()
> 	down_write(&current->mm->mmap_sem);
> 
> CPU1:
> 	pages = get_user_pages_fast(addr, 1, 0, &args.page);
> which hits an absent page and goto slow.  Then it goes on to
>         ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT,
>                                       pages, write ? FOLL_WRITE : 0);
> which does
>         return __get_user_pages_unlocked(current, current->mm, start, nr_pages,
>                                          pages, gup_flags | FOLL_TOUCH);
> which does
>         down_read(&mm->mmap_sem);
>         ret = __get_user_pages_locked(tsk, mm, start, nr_pages, pages, NULL, 
>                                       &locked, false, gup_flags);
> 
> and we have a classical deadlock on recursive down_read() (thread 1: down_read()
> gets the rwsem; thread 2: down_write() blocks waiting for thread 1 to release
> it; thread 1: down_read() blocks waiting for thread 2 to get through down_write()
> and eventual up_write(), which completes the deadlock).

Hmm, indeed. Thanks for spotting that one.

> Replacing pkey_alloc(2) with e.g. mmap(2) turns that from hard deadlock into
> something killable, but while "with bad timing you might get process stuck
> hard" is worse than "with bad timing you might get process stuck until you
> kill -9 it", neither is a good thing.
> 
> I'm not familiar enough with arch/mips guts to suggest any variant of solution;
> replacing get_user_pages_fast() with get_user_pages_locked() would solve the
> deadlock, but that loses the fast path; not taking ->mmap_sem there have
> local_r4k_flush_cache_sigtramp() run without fcs_args->mm being locked, which
> might or might not be a problem.  Suggestions?

I suspect my concern was the theoretical situation where:

task1 hits an FPU branch instruction that needs emulating
	this creates a trampoline above the stack in user memory to
	execute the delay slot instruction, and calls
	r4k_flush_cache_sigtramp()

task2 in same process changes the memory mapping somehow.

SMP call to local_r4k_flush_cache_sigtramp()
	it may do cache op using the user virtual address when same mm
	active, or using kmap otherwise.

For the kmap case (SMP call to CPU with different process executing) its
holding a reference to the page so it should be fine. If the page was
swapped out or otherwise copied and it icache flushed the old page, then
I think the new page would get flushed when swapped back in so thats
fine.

Otherwise I can't imagine a situation where it'd misbehave unless
userland explicitly remapped the memory and it accessed using the
virtual address, in which case its asking for trouble.

Therefore I suspect it should be safe to drop the mmap_sem, but thats
OTOH. Have I missed anything important?

Thanks
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-09-26 13:59 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-23 13:40 potential deadlock in r4k_flush_cache_sigtramp() Al Viro
2017-09-26 13:59 ` James Hogan
2017-09-26 13:59   ` James Hogan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox