From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
"Peter Zijlstra" <peterz@infradead.org>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Ingo Molnar" <mingo@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org
Subject: Re: [PATCH] kvm/x86: Handle async PF in RCU read-side critical sections
Date: Sat, 30 Sep 2017 10:15:15 -0700 [thread overview]
Message-ID: <20170930171515.GK3521@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170929234156.5e5oucxsqhxoqcml@tardis>
On Sat, Sep 30, 2017 at 07:41:56AM +0800, Boqun Feng wrote:
> On Fri, Sep 29, 2017 at 04:43:39PM +0000, Paul E. McKenney wrote:
> > On Fri, Sep 29, 2017 at 04:53:57PM +0200, Paolo Bonzini wrote:
> > > On 29/09/2017 13:01, Boqun Feng wrote:
> > > > Sasha Levin reported a WARNING:
> > > >
> > > > | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
> > > > | rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
> > > > | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
> > > > | rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
> > > > ...
> > > > | CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246
> > > > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > > | 1.10.1-1ubuntu1 04/01/2014
> > > > | Call Trace:
> > > > ...
> > > > | RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
> > > > | RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
> > > > | RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002
> > > > | RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000
> > > > | RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410
> > > > | RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001
> > > > | R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08
> > > > | R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040
> > > > | __schedule+0x201/0x2240 kernel/sched/core.c:3292
> > > > | schedule+0x113/0x460 kernel/sched/core.c:3421
> > > > | kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158
> > > > | do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271
> > > > | async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069
> > > > | RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996
> > > > | RSP: 0018:ffff88003b2df520 EFLAGS: 00010283
> > > > | RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670
> > > > | RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140
> > > > | RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000
> > > > | R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8
> > > > | R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000
> > > > | vsnprintf+0x173/0x1700 lib/vsprintf.c:2136
> > > > | sprintf+0xbe/0xf0 lib/vsprintf.c:2386
> > > > | proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23
> > > > | get_link fs/namei.c:1047 [inline]
> > > > | link_path_walk+0x1041/0x1490 fs/namei.c:2127
> > > > ...
> > > >
> > > > And this happened when we hit a page fault in an RCU read-side critical
> > > > section and then we tried to reschedule in kvm_async_pf_task_wait(),
> > > > this reschedule would hit the WARN in rcu_preempt_note_context_switch(),
> > > > and be treated as a sleep in RCU read-side critical section, which is
> > > > not allowed(even in preemptible RCU).
> > >
> > > Just a small fix to the commit message:
> > >
> > > This happened when the host hit a page fault, and delivered it as in an
> > > async page fault, while the guest was in an RCU read-side critical
> > > section. The guest then tries to reschedule in kvm_async_pf_task_wait(),
> > > but rcu_preempt_note_context_switch() would treat the reschedule as a
> > > sleep in RCU read-side critical section, which is not allowed (even in
> > > preemptible RCU). Thus the WARN.
> > >
> > > Queued with that change, thanks.
> >
> > Not to be repetitive, but if the schedule() is on the guest, this change
> > really does silently break up an RCU read-side critical section on
> > guests built with PREEMPT=n. (Yes, they were already being broken,
> > but it would be good to avoid this breakage in PREEMPT=n as well as
> > in PREEMPT=y.)
> >
>
> Then probably adding !IS_ENABLED(CONFIG_PREEMPT) as one of the reason we
> choose the halt path? Like:
>
> n.halted = is_idle_task(current) || preempt_count() > 1 ||
> !IS_ENABLED(CONFIG_PREEMPT) || rcu_preempt_depth();
>
>
> But I think async PF could also happen while a user program is running?
> Then maybe add a second parameter @user for kvm_async_pf_task_wait(),
> like:
>
> kvm_async_pf_task_wait((u32)read_cr2(), user_mode(regs));
>
> and the halt condition becomes:
>
> n.halted = is_idle_task(current) || preempt_count() > 1 ||
> (!IS_ENABLED(CONFIG_PREEMPT) && !user) || rcu_preempt_depth();
>
> Thoughts?
This looks to me like it would cover it. If !PREEMPT interrupt from
kernel, we halt, which would prevent the sleep.
I take it that we get unhalted when the host gets things patched up?
> A side thing is being broken already for PREEMPT=n means we maybe fail
> to detect this in rcutorture? Then should we add a config with
> KVM_GUEST=y and try to run some memory consuming things(e.g. stress
> --vm) in the rcutorture kvm script simultaneously? Paolo, do you have
> any test workload that could trigger async PF quickly?
I do not believe that have seen this in rcutorture, but I always run in
a guest OS on a large-memory system (well, by my old-fashioned standards,
anyway) that would be quite unlikely to evict a guest OS's pages. Plus
I tend to run on shared systems, and deliberately running them out of
memory would not be particularly friendly to others using those systems.
I -do- run background scripts that are intended to force the host OS to
preempt the guest OSes frequently, but I don't believe that this would
cause that bug.
But it seems like it would make more sense to add this sort of thing to
whatever KVM tests there are for host-side eviction of guest pages.
Thanx, Paul
next prev parent reply other threads:[~2017-09-30 17:15 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-29 11:01 [PATCH] kvm/x86: Handle async PF in RCU read-side critical sections Boqun Feng
2017-09-29 14:53 ` Paolo Bonzini
2017-09-29 16:43 ` Paul E. McKenney
2017-09-29 23:41 ` Boqun Feng
2017-09-30 17:15 ` Paul E. McKenney [this message]
2017-10-01 1:20 ` Boqun Feng
2017-10-02 12:45 ` Paolo Bonzini
2017-10-02 13:53 ` Paul E. McKenney
2017-10-01 1:31 ` [PATCH v2] " Boqun Feng
2017-10-02 13:41 ` Paolo Bonzini
2017-10-02 14:43 ` Boqun Feng
2017-10-02 15:34 ` Paul E. McKenney
2017-10-02 21:11 ` Paolo Bonzini
2017-10-02 21:41 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170930171515.GK3521@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=boqun.feng@gmail.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=rkrcmar@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.