linux-next.vger.kernel.org archive mirror
* [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault
@ 2012-11-27  5:15 Li Zhong
From: Li Zhong @ 2012-11-27  5:15 UTC
  To: linux-next list, LKML; +Cc: paulmck, sasha.levin, gleb, avi, fweisbec

I noticed some warnings complaining about the dynticks_nesting value, like:

[  267.545032] ------------[ cut here ]------------
[  267.545032] WARNING: at kernel/rcutree.c:382 rcu_eqs_enter+0xab/0xc0()
[  267.545032] Hardware name: Bochs
[  267.545032] Modules linked in:
[  267.545032] Pid: 0, comm: swapper/2 Not tainted 3.7.0-rc5-next-20121115 #8
[  267.545032] Call Trace:
[  267.545032]  [<ffffffff8104714f>] warn_slowpath_common+0x7f/0xc0
[  267.545032]  [<ffffffff810471aa>] warn_slowpath_null+0x1a/0x20
[  267.545032]  [<ffffffff810e607b>] rcu_eqs_enter+0xab/0xc0
[  267.545032]  [<ffffffff810e60bb>] rcu_idle_enter+0x2b/0x70
[  267.545032]  [<ffffffff8100d44f>] cpu_idle+0x6f/0x100
[  267.545032]  [<ffffffff814bf055>] start_secondary+0x205/0x20c
[  267.545032] ---[ end trace 924ae80da035028d ]---

After enabling rcu_dyntick tracing, I got the following abnormal
dynticks_nesting values (13fffffffffffff, ff00000000000001, etc.):
			...
 1	<idle>-0     [002] dN.2 18739.518567: rcu_dyntick: End 0 140000000000000		rcu_idle_exit
 2	  sshd-696   [002] d..1 18739.518675: rcu_dyntick: ++= 140000000000000 140000000000001	rcu_irq_enter   - apf (not present)

 3	<idle>-0     [002] d..2 18739.518705: rcu_dyntick: Start 140000000000001 0		rcu_idle_enter
 4	<idle>-0     [002] d..2 18739.521252: rcu_dyntick: End 0 1				rcu_irq_enter	- apf (page ready)
 5	<idle>-0     [002] dN.2 18739.521261: rcu_dyntick: Start 1 0				rcu_irq_exit	- apf (page ready)
 6	<idle>-0     [002] dN.2 18739.521263: rcu_dyntick: End 0 140000000000000		rcu_idle_exit

 7	  sshd-696   [002] d..1 18739.521299: rcu_dyntick: --= 140000000000000 13fffffffffffff	rcu_irq_exit	- apf (not present)
 8	  sshd-696   [002] d..1 18739.521302: rcu_dyntick: Start 13fffffffffffff 0		rcu_user_enter
 9	  sshd-696   [002] d..1 18739.521330: rcu_dyntick: End 0 1				rcu_irq_enter   - apf (not present)

10	<idle>-0     [002] d..2 18739.521346: rcu_dyntick: Start 1 0				rcu_idle_enter - old value 1, warning
11	<idle>-0     [002] d..2 18739.530021: rcu_dyntick: ++= ff00000000000001 ff00000000000002
12	<idle>-0     [002] dN.2 18739.530029: rcu_dyntick: --= ff00000000000002 ff00000000000001
		...

After each line, I added the function that I guess emitted the trace.

Line #1: the idle-0 process calls rcu_idle_exit(), finishes one loop
iteration, and switches to sshd-696.

Line #2: sshd-696 calls rcu_irq_enter() because of an async page fault
(page not present), and puts itself to sleep waiting for the page to
become ready.

Line #3: idle-0 is switched in and clears dynticks_nesting to 0.

Lines #4-5: I think rcu_irq_enter()/rcu_irq_exit() are called because
the page for sshd-696 is ready.

Line #6: idle-0 calls rcu_idle_exit() to switch to sshd-696.

Line #7: sshd-696 calls rcu_irq_exit() in the apf (page not present)
code path, decreasing dynticks_nesting to 13fffffffffffff. The matching
increment from line #2 was already wiped out in line #3.

Lines #8-9: sshd-696 calls rcu_user_enter() to start a user eqs, and
gets an async page fault again. It puts itself to sleep again, with a
dynticks_nesting value of 1.

Line #10: idle-0 switches in. As the dynticks_nesting value is 1, a
warning is reported in rcu_idle_enter(), and the value is then decreased
to ff00000000000001. (The tracing log shows the new value as 0 because
rcu hard-codes the traced value as 0. I will send another patch for
this.)

The patch below tries to replace rcu_irq_enter()/rcu_irq_exit() with
rcu_idle_exit()/rcu_idle_enter() when the CPU is in rcu idle and the
current task is the idle task; otherwise, rcu_user_exit() is called to
exit the user eqs if necessary.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/x86/kernel/kvm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 4180a87..f65648d 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -247,10 +247,17 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
 		break;
 	case KVM_PV_REASON_PAGE_NOT_PRESENT:
 		/* page is swapped out by the host. */
-		rcu_irq_enter();
+		if (is_idle_task(current) && rcu_is_cpu_idle())
+			rcu_idle_exit();
+		else
+			rcu_user_exit();
+
 		exit_idle();
 		kvm_async_pf_task_wait((u32)read_cr2());
-		rcu_irq_exit();
+
+		if (is_idle_task(current) && rcu_is_cpu_idle())
+			rcu_idle_enter();
+
 		break;
 	case KVM_PV_REASON_PAGE_READY:
 		rcu_irq_enter();
-- 
1.7.11.4
