Re: [PATCH 1/6] KVM: Use acquire/release semantics when accessing dirty ring GFN state

From: Marc Zyngier <maz@kernel.org>
To: Gavin Shan <gshan@redhat.com>
Cc: kvm@vger.kernel.org, catalin.marinas@arm.com,
	andrew.jones@linux.dev, will@kernel.org, shan.gavin@gmail.com,
	bgardon@google.com, dmatlack@google.com, pbonzini@redhat.com,
	zhenyzha@redhat.com, shuah@kernel.org,
	kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH 1/6] KVM: Use acquire/release semantics when accessing dirty ring GFN state
Date: Fri, 23 Sep 2022 15:40:07 +0100
Message-ID: <87bkr6jgs8.wl-maz@kernel.org>
In-Reply-To: <e8ddf130-c5e1-d872-c7c8-675d40742b1e@redhat.com>

On Fri, 23 Sep 2022 00:46:58 +0100,
Gavin Shan <gshan@redhat.com> wrote:
> 
> Hi Peter,
> 
> On 9/23/22 7:38 AM, Peter Xu wrote:
> > On Thu, Sep 22, 2022 at 06:01:28PM +0100, Marc Zyngier wrote:
> >> The current implementation of the dirty ring has an implicit requirement
> >> that stores to the dirty ring from userspace must be:
> >> 
> >> - ordered with one another
> >> 
> >> - visible from another CPU executing a ring reset
> >> 
> >> While these implicit requirements work well for x86 (and any other
> >> TSO-like architecture), they do not work for more relaxed architectures
> >> such as arm64, where stores to different addresses can be freely
> >> reordered, and loads from these addresses may not observe writes from
> >> another CPU unless the required barriers (or acquire/release semantics)
> >> are used.
> >> 
> >> In order to start fixing this, upgrade the ring reset accesses:
> >> 
> >> - the kvm_dirty_gfn_harvested() helper now uses acquire semantics
> >>    so it is ordered after all previous writes, including that from
> >>    userspace
> >> 
> >> - the kvm_dirty_gfn_set_invalid() helper now uses release semantics
> >>    so that the next_slot and next_offset reads don't drift past
> >>    the entry invalidation
> >> 
> >> This is only a partial fix as the userspace side also needs upgrading.
> > 
> > Paolo has one fix 4802bf910e ("KVM: dirty ring: add missing memory
> > barrier", 2022-09-01) which has already landed.
> > 
> > I think the other one to reset it was lost too.  I just posted a patch.
> > 
> > https://lore.kernel.org/qemu-devel/20220922213522.68861-1-peterx@redhat.com/
> > (link still not yet available so far, but should be)
> > 
> >> 
> >> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >> ---
> >>   virt/kvm/dirty_ring.c | 4 ++--
> >>   1 file changed, 2 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
> >> index f4c2a6eb1666..784bed80221d 100644
> >> --- a/virt/kvm/dirty_ring.c
> >> +++ b/virt/kvm/dirty_ring.c
> >> @@ -79,12 +79,12 @@ static inline void kvm_dirty_gfn_set_invalid(struct kvm_dirty_gfn *gfn)
> >>   static inline void kvm_dirty_gfn_set_dirtied(struct kvm_dirty_gfn *gfn)
> >>   {
> >> -	gfn->flags = KVM_DIRTY_GFN_F_DIRTY;
> >> +	smp_store_release(&gfn->flags, KVM_DIRTY_GFN_F_DIRTY);
> > 
> > IIUC you meant kvm_dirty_gfn_set_invalid as the comment says?
> > 
> > kvm_dirty_gfn_set_dirtied() has been guarded by smp_wmb() and AFAICT that's
> > already safe.  Otherwise looks good to me.
> > 
> 
> If I'm understanding the full context, smp_store_release() also
> enforces ordering on 'gfn->flags' itself. It is needed by user space
> for the synchronization.

There are multiple things at play here:

- userspace needs a store-release when making the flags 'harvested',
  so that the kernel using a load-acquire can observe this write (and
  avoid the roach-motel effect of a non-acquire load)

- the kernel needs a store-release when making the flags 'invalid',
  preventing this write from occurring before the next_* fields have
  been sampled
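
To make the two pairings concrete, here is a rough userspace model using
C11 atomics. It is only a sketch and not the kernel code: struct
model_gfn and the function names are made up for illustration, and
atomic_load_explicit()/atomic_store_explicit() stand in for
smp_load_acquire()/smp_store_release(). The flag values follow the KVM
UAPI, with 0 meaning 'invalid'.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as in the KVM UAPI; flags == 0 means 'invalid'. */
#define GFN_F_DIRTY (1u << 0)
#define GFN_F_RESET (1u << 1)

/* Hypothetical stand-in for struct kvm_dirty_gfn. */
struct model_gfn {
	_Atomic uint32_t flags;
	uint32_t slot;
	uint64_t offset;
};

/* Userspace: sample the entry, then mark it 'harvested'. */
static void user_harvest(struct model_gfn *gfn, uint32_t *slot,
			 uint64_t *offset)
{
	*slot = gfn->slot;
	*offset = gfn->offset;
	/* Release: the accesses above cannot move below this store. */
	atomic_store_explicit(&gfn->flags, GFN_F_RESET,
			      memory_order_release);
}

/* Kernel: the load-acquire pairs with the store-release above. */
static bool kernel_gfn_harvested(struct model_gfn *gfn)
{
	return atomic_load_explicit(&gfn->flags,
				    memory_order_acquire) & GFN_F_RESET;
}

/* Kernel: reset one entry if userspace has harvested it. */
static bool kernel_reset_one(struct model_gfn *gfn, uint32_t *next_slot,
			     uint64_t *next_offset)
{
	if (!kernel_gfn_harvested(gfn))
		return false;

	*next_slot = gfn->slot;
	*next_offset = gfn->offset;
	/* Release: the next_* reads cannot drift past the invalidation. */
	atomic_store_explicit(&gfn->flags, 0, memory_order_release);
	return true;
}
```

The model runs single-threaded as written; the memory orders only
matter once user_harvest() and kernel_reset_one() execute on different
CPUs.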

On the ring production side, there is a heavy-handed smp_wmb(), which
makes things pretty safe.
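
As a rough illustration of the production side, the sketch below models
it with C11 atomics in userspace. The names (struct model_gfn,
kernel_push(), user_peek()) are hypothetical, and
atomic_thread_fence(memory_order_release) is only an approximation of
smp_wmb(): it is somewhat stronger, since it also orders earlier loads
against the flag store. The acquire load on the consumer side is my
assumption of what a correct userspace reader would use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define GFN_F_DIRTY (1u << 0)

/* Hypothetical stand-in for struct kvm_dirty_gfn. */
struct model_gfn {
	_Atomic uint32_t flags;
	uint32_t slot;
	uint64_t offset;
};

/* Kernel producer: fill in the entry, then publish it as dirty. */
static void kernel_push(struct model_gfn *gfn, uint32_t slot,
			uint64_t offset)
{
	gfn->slot = slot;
	gfn->offset = offset;
	/* Approximates smp_wmb(): payload stores come before the flag. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&gfn->flags, GFN_F_DIRTY,
			      memory_order_relaxed);
}

/* Userspace consumer: only read the payload once DIRTY is observed. */
static bool user_peek(struct model_gfn *gfn, uint32_t *slot,
		      uint64_t *offset)
{
	if (!(atomic_load_explicit(&gfn->flags, memory_order_acquire) &
	      GFN_F_DIRTY))
		return false;

	*slot = gfn->slot;
	*offset = gfn->offset;
	return true;
}
```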

	M.

-- 
Without deviation from the norm, progress is not possible.