* [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior
@ 2013-05-26 14:21 Michael S. Tsirkin
  2013-05-26 14:32 ` [PATCH v3-resend 10/11] kernel: drop voluntary schedule from might_fault Michael S. Tsirkin
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Michael S. Tsirkin @ 2013-05-26 14:21 UTC
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Arnd Bergmann, linux-arch, linux-mm,
	kvm

I seem to have mis-sent v3.  Trying again with the same patches after
fixing the message id for the cover letter. I hope the duplicates
that are thus created don't inconvenience people too much.
If they do, I apologize.
I have pared down the Cc list to reduce the noise.
The sched maintainers are Cc'd on all patches since that's
the tree I aim for with these patches.

This improves the might_fault annotations used
by uaccess routines:

1. The only reason uaccess routines might sleep
   is if they fault. Make this explicit for
   all architectures.
2. A voluntary preempt point in uaccess functions
   means the compiler can't inline them efficiently.
   This breaks the assumption, which e.g. net code seems
   to make, that they are very fast and small.
   Remove this preempt point so behaviour
   matches what callers assume.
3. Accesses (e.g. through socket ops) to kernel memory
   with KERNEL_DS, as net/sunrpc does, will never sleep.
   Remove an unconditional might_sleep in the inline
   might_fault in kernel.h
   (used when PROVE_LOCKING is not set).
4. Accesses under pagefault_disable return EFAULT on a fault
   but won't cause the caller to sleep.
   Check for that and avoid might_sleep when
   PROVE_LOCKING is set.
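
As a rough illustration of points 3 and 4, here is a userspace model
(not kernel code) of when a uaccess call can actually end up sleeping.
The struct and function names are hypothetical and exist only for this
sketch; the two flags stand in for get_fs() == KERNEL_DS and for
running under pagefault_disable().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state a uaccess routine runs under. */
struct uaccess_ctx {
	bool kernel_ds;           /* models get_fs() == KERNEL_DS */
	bool pagefault_disabled;  /* models running under pagefault_disable() */
};

/*
 * Returns true only when the access may fault and sleep, which is the
 * only case where a sleep-in-atomic check is meaningful.
 */
static bool uaccess_may_sleep(const struct uaccess_ctx *ctx)
{
	assert(ctx);
	if (ctx->kernel_ds)
		return false;	/* point 3: kernel memory, never sleeps */
	if (ctx->pagefault_disabled)
		return false;	/* point 4: a fault fails instead of sleeping */
	return true;		/* ordinary user access: may fault and sleep */
}
```

Only the last case needs a might_sleep-style diagnostic; the other two
are the false positives the series sets out to remove.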

I'd like these changes to go in for 3.11:
besides a general benefit of improved
consistency and performance, I would also like them
for the vhost driver, where we want to call socket ops
under a spinlock, and fall back on a slower thread handler
on error.

If the changes look good, would sched maintainers
please consider merging them through sched/core because of the
interaction with the scheduler?

Please review, and consider for 3.11.

Note on arch code updates:
I tested x86_64 code.
Other architectures were build-tested.
I don't have a cross-build environment for the arm64, tile, microblaze and
mn10300 architectures. arm64 and tile got acks.
The arch changes look generally safe enough,
but I would appreciate review/acks from arch maintainers.
Core changes naturally need acks from sched maintainers.

Version 1 of this change was titled
	x86: uaccess s/might_sleep/might_fault/

Changes from v2:
	add a patch removing a voluntary preempt point
	in uaccess functions when PREEMPT_VOLUNTARY is set.
		Addresses comments by Arnd Bergmann,
		and Peter Zijlstra.
	comment on future possible simplifications in the git log
		for the powerpc patch. Addresses a comment
		by Arnd Bergmann.
	
Changes from v1:
	add more architectures
	fix might_fault() scheduling differently depending
	on CONFIG_PROVE_LOCKING, as suggested by Ingo

Michael S. Tsirkin (11):
  asm-generic: uaccess s/might_sleep/might_fault/
  arm64: uaccess s/might_sleep/might_fault/
  frv: uaccess s/might_sleep/might_fault/
  m32r: uaccess s/might_sleep/might_fault/
  microblaze: uaccess s/might_sleep/might_fault/
  mn10300: uaccess s/might_sleep/might_fault/
  powerpc: uaccess s/might_sleep/might_fault/
  tile: uaccess s/might_sleep/might_fault/
  x86: uaccess s/might_sleep/might_fault/
  kernel: drop voluntary schedule from might_fault
  kernel: uaccess in atomic with pagefault_disable

 arch/arm64/include/asm/uaccess.h      |  4 ++--
 arch/frv/include/asm/uaccess.h        |  4 ++--
 arch/m32r/include/asm/uaccess.h       | 12 ++++++------
 arch/microblaze/include/asm/uaccess.h |  6 +++---
 arch/mn10300/include/asm/uaccess.h    |  4 ++--
 arch/powerpc/include/asm/uaccess.h    | 16 ++++++++--------
 arch/tile/include/asm/uaccess.h       |  2 +-
 arch/x86/include/asm/uaccess_64.h     |  2 +-
 include/asm-generic/uaccess.h         | 10 +++++-----
 include/linux/kernel.h                |  7 ++-----
 mm/memory.c                           | 10 +++++++---
 11 files changed, 39 insertions(+), 38 deletions(-)

-- 
MST

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH v3-resend 10/11] kernel: drop voluntary schedule from might_fault
  2013-05-26 14:21 [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior Michael S. Tsirkin
@ 2013-05-26 14:32 ` Michael S. Tsirkin
  2013-05-26 14:32 ` [PATCH v3-resend 11/11] kernel: uaccess in atomic with pagefault_disable Michael S. Tsirkin
  2013-05-27 16:35 ` [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior Peter Zijlstra
  2 siblings, 0 replies; 4+ messages in thread
From: Michael S. Tsirkin @ 2013-05-26 14:32 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Peter Zijlstra, Arnd Bergmann, linux-mm

might_fault is called from functions like copy_to_user,
which most callers expect to be very fast, on the order of
a couple of instructions.  So functions like memcpy_toiovec call them
many times in a loop.
But might_fault calls might_sleep(), and with CONFIG_PREEMPT_VOLUNTARY
this results in a function call.

Let's not do this: just call __might_sleep, which produces
a diagnostic for sleeping within atomic context, but drop
the might_resched() preempt point.
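
The effect can be sketched with a small userspace model (not kernel
code). All names below are stand-ins invented for this sketch: the
"before" variant runs the diagnostic plus a preempt point on every
call, the "after" variant runs only the diagnostic, and the loop mimics
a memcpy_toiovec-style caller that annotates once per segment.

```c
#include <assert.h>

static int atomic_depth;             /* models preempt_count(): nonzero means atomic */
static unsigned long resched_calls;  /* how often the voluntary preempt point ran */
static unsigned long sleep_warnings; /* sleep-in-atomic diagnostics that fired */

/* Stand-in for __might_sleep(): diagnostic only, never schedules. */
static void model_might_sleep_check(void)
{
	if (atomic_depth > 0)
		sleep_warnings++;
}

/* Stand-in for the preempt point taken under CONFIG_PREEMPT_VOLUNTARY. */
static void model_cond_resched(void)
{
	resched_calls++;
}

/* Before the patch: diagnostic plus a preempt point on every call. */
static void might_fault_before(void)
{
	model_might_sleep_check();
	model_cond_resched();
}

/* After the patch: diagnostic only. */
static void might_fault_after(void)
{
	model_might_sleep_check();
}

/* Models a memcpy_toiovec-style loop: one annotation per segment copied. */
static void copy_segments(int nsegs, void (*annotate)(void))
{
	int i;

	assert(annotate);
	for (i = 0; i < nsegs; i++)
		annotate();
}
```

Copying 16 segments with might_fault_before bumps resched_calls 16
times, while might_fault_after leaves it untouched; the diagnostic
still fires in both variants when atomic_depth is nonzero.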

Here's a test sending traffic between the VM and the host,
host is built with CONFIG_PREEMPT_VOLUNTARY:
Before:
	incoming: 7122.77   Mb/s
	outgoing: 8480.37   Mb/s
After:
	incoming: 8619.24   Mb/s
	outgoing: 9455.42   Mb/s
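
For scale, these figures work out to roughly a 21% gain incoming and
about 11.5% outgoing. A trivial helper to redo the arithmetic (the
function name is made up for this note, it is not part of the patch):

```c
#include <assert.h>

/* Relative throughput gain of 'after' over 'before', in percent. */
static double gain_percent(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

Call it as gain_percent(7122.77, 8619.24) for the incoming numbers and
gain_percent(8480.37, 9455.42) for the outgoing ones.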

As a side effect, this fixes an issue pointed
out by Ingo: might_fault might schedule differently
depending on PROVE_LOCKING. Now there's no
preemption point in either case, so it's consistent.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/kernel.h | 2 +-
 mm/memory.c            | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e96329c..c514c06 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -198,7 +198,7 @@ void might_fault(void);
 #else
 static inline void might_fault(void)
 {
-	might_sleep();
+	__might_sleep(__FILE__, __LINE__, 0);
 }
 #endif
 
diff --git a/mm/memory.c b/mm/memory.c
index 6dc1882..c1f190f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4222,7 +4222,8 @@ void might_fault(void)
 	if (segment_eq(get_fs(), KERNEL_DS))
 		return;
 
-	might_sleep();
+	__might_sleep(__FILE__, __LINE__, 0);
+
 	/*
 	 * it would be nicer only to annotate paths which are not under
 	 * pagefault_disable, however that requires a larger audit and
-- 
MST

* [PATCH v3-resend 11/11] kernel: uaccess in atomic with pagefault_disable
  2013-05-26 14:21 [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior Michael S. Tsirkin
  2013-05-26 14:32 ` [PATCH v3-resend 10/11] kernel: drop voluntary schedule from might_fault Michael S. Tsirkin
@ 2013-05-26 14:32 ` Michael S. Tsirkin
  2013-05-27 16:35 ` [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior Peter Zijlstra
  2 siblings, 0 replies; 4+ messages in thread
From: Michael S. Tsirkin @ 2013-05-26 14:32 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Peter Zijlstra, Arnd Bergmann, linux-mm

This changes might_fault so that it does not
trigger a false positive diagnostic for e.g. the following
sequence:
	spin_lock_irqsave
	pagefault_disable
	copy_to_user
	pagefault_enable
	spin_unlock_irqrestore

In particular vhost wants to do this, to call
socket ops from under a lock.
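
A userspace sketch of that sequence and of the fallback it enables
(not kernel code; every model_* name and copy_under_lock are
hypothetical). It assumes that pagefault_disable() raises the preempt
count so that in_atomic() is true, and it simplifies the copy's return
value to 0 or -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static int atomic_depth;            /* models preempt_count(): nonzero means in_atomic() */
static int sleep_warnings;          /* sleep-in-atomic diagnostics that fired */
static bool check_before_in_atomic; /* true models the ordering before this patch */

static void model_lock(void)              { atomic_depth++; }
static void model_unlock(void)            { assert(atomic_depth > 0); atomic_depth--; }
/* Assumption: pagefault_disable() bumps the preempt count. */
static void model_pagefault_disable(void) { atomic_depth++; }
static void model_pagefault_enable(void)  { assert(atomic_depth > 0); atomic_depth--; }

/* With the patch, in_atomic() returns early and nothing is reported. */
static void model_might_fault(void)
{
	if (check_before_in_atomic && atomic_depth > 0)
		sleep_warnings++;	/* old ordering: false positive */
}

/*
 * If the destination is not resident a real kernel would fault. In atomic
 * context that fails; otherwise it sleeps and completes, which this model
 * treats as plain success.
 */
static int model_copy_to_user(char *dst, const char *src, size_t len, bool resident)
{
	model_might_fault();
	if (!resident && atomic_depth > 0)
		return -EFAULT;
	memcpy(dst, src, len);
	return 0;
}

/* The sequence above; -EFAULT means "retry from a context that can sleep". */
static int copy_under_lock(char *dst, const char *src, size_t len, bool resident)
{
	int ret;

	model_lock();			/* spin_lock_irqsave */
	model_pagefault_disable();	/* pagefault_disable */
	ret = model_copy_to_user(dst, src, len, resident);
	model_pagefault_enable();	/* pagefault_enable */
	model_unlock();			/* spin_unlock_irqrestore */
	return ret;
}
```

A caller would try copy_under_lock() first and hand the work to a
slower handler that is allowed to sleep only when it returns -EFAULT.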

There are 3 cases to consider:
CONFIG_PROVE_LOCKING - might_fault is non-inline
so it's easy to move the in_atomic test to fix
up the false positive warning.

CONFIG_DEBUG_ATOMIC_SLEEP - might_fault
is currently inline, but we are calling a
non-inline __might_sleep anyway,
so let's use the non-inline version of might_fault
that does the right thing.

!CONFIG_DEBUG_ATOMIC_SLEEP && !CONFIG_PROVE_LOCKING
__might_sleep is a nop so might_fault is a nop.
Make this explicit.
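
The three cases collapse into a single condition, which can be sketched
as a small userspace function (the enum and function names are
illustrative only; the real selection is done by the preprocessor):

```c
#include <stdbool.h>

enum might_fault_kind {
	MIGHT_FAULT_OUT_OF_LINE,	/* the checking version in mm/memory.c */
	MIGHT_FAULT_NOP,		/* empty inline, compiles away */
};

/* Maps the two config options onto the variant that gets built. */
static enum might_fault_kind might_fault_variant(bool prove_locking,
						 bool debug_atomic_sleep)
{
	if (prove_locking || debug_atomic_sleep)
		return MIGHT_FAULT_OUT_OF_LINE;
	return MIGHT_FAULT_NOP;
}
```

Either debug option alone is enough to select the out-of-line version;
only a build with both disabled gets the empty inline.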

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/kernel.h |  7 ++-----
 mm/memory.c            | 11 +++++++----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index c514c06..0153be1 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -193,13 +193,10 @@ extern int _cond_resched(void);
 		(__x < 0) ? -__x : __x;		\
 	})
 
-#ifdef CONFIG_PROVE_LOCKING
+#if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
 void might_fault(void);
 #else
-static inline void might_fault(void)
-{
-	__might_sleep(__FILE__, __LINE__, 0);
-}
+static inline void might_fault(void) { }
 #endif
 
 extern struct atomic_notifier_head panic_notifier_list;
diff --git a/mm/memory.c b/mm/memory.c
index c1f190f..d7d54a1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4210,7 +4210,7 @@ void print_vma_addr(char *prefix, unsigned long ip)
 	up_read(&mm->mmap_sem);
 }
 
-#ifdef CONFIG_PROVE_LOCKING
+#if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
 void might_fault(void)
 {
 	/*
@@ -4222,14 +4222,17 @@ void might_fault(void)
 	if (segment_eq(get_fs(), KERNEL_DS))
 		return;
 
-	__might_sleep(__FILE__, __LINE__, 0);
-
 	/*
 	 * it would be nicer only to annotate paths which are not under
 	 * pagefault_disable, however that requires a larger audit and
 	 * providing helpers like get_user_atomic.
 	 */
-	if (!in_atomic() && current->mm)
+	if (in_atomic())
+		return;
+
+	__might_sleep(__FILE__, __LINE__, 0);
+
+	if (current->mm)
 		might_lock_read(&current->mm->mmap_sem);
 }
 EXPORT_SYMBOL(might_fault);
-- 
MST

* Re: [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior
  2013-05-26 14:21 [PATCH v3-resend 00/11] uaccess: better might_sleep/might_fault behavior Michael S. Tsirkin
  2013-05-26 14:32 ` [PATCH v3-resend 10/11] kernel: drop voluntary schedule from might_fault Michael S. Tsirkin
  2013-05-26 14:32 ` [PATCH v3-resend 11/11] kernel: uaccess in atomic with pagefault_disable Michael S. Tsirkin
@ 2013-05-27 16:35 ` Peter Zijlstra
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2013-05-27 16:35 UTC
  To: Michael S. Tsirkin
  Cc: linux-kernel, Ingo Molnar, Arnd Bergmann, linux-arch, linux-mm,
	kvm

On Sun, May 26, 2013 at 05:21:30PM +0300, Michael S. Tsirkin wrote:
> If the changes look good, would sched maintainers
> please consider merging them through sched/core because of the
> interaction with the scheduler?
> 
> Please review, and consider for 3.11.

I'll stick them in my queue, we'll see if anything falls over ;-)