From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>, Marc Zyngier <maz@kernel.org>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm <kvm@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	Ben Gardon <bgardon@google.com>
Subject: Re: [PATCH v2 09/10] KVM: Don't take mmu_lock for range invalidation unless necessary
Date: Mon, 19 Apr 2021 15:09:01 +0000
Message-ID: <YH2dDRBXJcbUcbLi@google.com>
In-Reply-To: <2a7670e4-94c0-9f35-74de-a7d5b1504ced@redhat.com>

On Mon, Apr 19, 2021, Paolo Bonzini wrote:
> On 19/04/21 10:49, Wanpeng Li wrote:
> > I saw this splatting:
> > 
> >   ======================================================
> >   WARNING: possible circular locking dependency detected
> >   5.12.0-rc3+ #6 Tainted: G           OE
> >   ------------------------------------------------------
> >   qemu-system-x86/3069 is trying to acquire lock:
> >   ffffffff9c775ca0 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: __mmu_notifier_invalidate_range_end+0x5/0x190
> > 
> >   but task is already holding lock:
> >   ffffaff7410a9160 (&kvm->mmu_notifier_slots_lock){.+.+}-{3:3}, at: kvm_mmu_notifier_invalidate_range_start+0x36d/0x4f0 [kvm]
> 
> I guess it is possible to open-code the wait using a readers count and a
> spinlock (see patch after signature).  This allows including the
> rcu_assign_pointer in the same critical section that checks the number
> of readers.  Also on the plus side, the init_rwsem() is replaced by
> slightly nicer code.

Ugh, the count approach is nearly identical to Ben's original code.  Using a
rwsem seemed so clever :-/

> IIUC this could be extended to non-sleeping invalidations too, but I
> am not really sure about that.

Yes, that should be fine.

> There are some issues with the patch though:
> 
> - I am not sure if this should be a raw spin lock to avoid the same issue
> on PREEMPT_RT kernel.  That said the critical section is so tiny that using
> a raw spin lock may make sense anyway

If using spinlock_t is problematic, wouldn't mmu_lock already be an issue?  Or
am I misunderstanding your concern?

> - this loses the rwsem fairness.  On the other hand, mm/mmu_notifier.c's
> own interval-tree-based filter is also using a similar mechanism that is
> likewise not fair, so it should be okay.

The one concern I had with an unfair mechanism of this nature is that, in theory,
the memslot update could be blocked indefinitely.

> Any opinions?  For now I placed the change below in kvm/queue, but I'm
> leaning towards delaying this optimization to the next merge window.

I think delaying it makes sense.

> @@ -1333,9 +1351,22 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
>  	WARN_ON(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS);
>  	slots->generation = gen | KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS;
> -	down_write(&kvm->mmu_notifier_slots_lock);
> +	/*
> +	 * This cannot be an rwsem because the MMU notifier must not run
> +	 * inside the critical section.  A sleeping rwsem cannot exclude
> +	 * that.

How on earth did you decipher that from the splat?  I stared at it for a good
five minutes and was completely befuddled.

> +	 */
> +	spin_lock(&kvm->mn_invalidate_lock);
> +	prepare_to_rcuwait(&kvm->mn_memslots_update_rcuwait);
> +	while (kvm->mn_active_invalidate_count) {
> +		set_current_state(TASK_UNINTERRUPTIBLE);
> +		spin_unlock(&kvm->mn_invalidate_lock);
> +		schedule();
> +		spin_lock(&kvm->mn_invalidate_lock);
> +	}
> +	finish_rcuwait(&kvm->mn_memslots_update_rcuwait);
>  	rcu_assign_pointer(kvm->memslots[as_id], slots);
> -	up_write(&kvm->mmu_notifier_slots_lock);
> +	spin_unlock(&kvm->mn_invalidate_lock);
>  	synchronize_srcu_expedited(&kvm->srcu);
> 
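
To make the scheme in the quoted hunk concrete, here is a rough userspace
analogue of "readers count plus a lock, with the pointer update in the same
critical section as the count check".  It is only a sketch under stated
assumptions: it uses a pthread mutex and condition variable in place of the
spinlock and rcuwait, and every name in it (slots_guard, invalidate_begin,
invalidate_end, install_slots) is hypothetical and not taken from KVM.
Returning a snapshot from invalidate_begin() is also a simplification made for
the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields the quoted patch adds to struct kvm. */
struct slots_guard {
	pthread_mutex_t lock;   /* role of mn_invalidate_lock */
	pthread_cond_t  idle;   /* role of mn_memslots_update_rcuwait */
	unsigned int    active; /* role of mn_active_invalidate_count */
	void           *slots;  /* role of kvm->memslots[as_id] */
};

static void guard_init(struct slots_guard *g, void *initial)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->idle, NULL);
	g->active = 0;
	g->slots = initial;
}

/* Reader side: note an invalidation in flight, return the current slots. */
static void *invalidate_begin(struct slots_guard *g)
{
	void *snap;

	pthread_mutex_lock(&g->lock);
	g->active++;
	snap = g->slots;
	pthread_mutex_unlock(&g->lock);
	return snap;
}

/* Reader side: drop the count and wake a waiting updater on the last exit. */
static void invalidate_end(struct slots_guard *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->active > 0);
	if (--g->active == 0)
		pthread_cond_broadcast(&g->idle);
	pthread_mutex_unlock(&g->lock);
}

/*
 * Updater side: sleep until no invalidation is in flight, then publish the
 * new pointer while still holding the lock that protects the count.
 */
static void install_slots(struct slots_guard *g, void *new_slots)
{
	pthread_mutex_lock(&g->lock);
	while (g->active)
		pthread_cond_wait(&g->idle, &g->lock);
	g->slots = new_slots;
	pthread_mutex_unlock(&g->lock);
}
```

Like the quoted patch, this sketch is not fair: nothing stops new readers from
entering while the updater is waiting, so a steady stream of overlapping
invalidations could keep install_slots() asleep indefinitely.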
