All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "kvm list" <kvm@vger.kernel.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: KVM: x86: fix kvmclock write race
Date: Fri, 17 Apr 2015 21:20:10 -0300	[thread overview]
Message-ID: <20150418002010.GA23227@amt.cnet> (raw)
In-Reply-To: <CALCETrWRZENhH2O21CRqQ8fnFkSj7ff6RH4fQZTHf84AM5kd9g@mail.gmail.com>

On Fri, Apr 17, 2015 at 05:04:29PM -0700, Andy Lutomirski wrote:
> On Fri, Apr 17, 2015 at 4:38 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >
> > From: Radim Krčmář <rkrcmar@redhat.com>
> >
> > As noted by Andy Lutomirski, kvm does not follow the documented version
> > protocol. Fix it.
> >
> > Note: this bug results in a race which can occur if the following three
> > conditions are met:
> >
> > 1) There is KVM guest time update (there is one every 5 minutes).
> >
> > 2) Which races with a thread in the guest in the following way:
> > The execution of these 29 instructions has to take at _least_
> > 2 seconds (rebalance interval is 1 second).
> >
> > lsl    %r9w,%esi
> > mov    %esi,%r8d
> > and    $0x3f,%esi
> > and    $0xfff,%r8d
> > test   $0xfc0,%r8d
> > jne    0xa12 <vread_pvclock+210>
> > shl    $0x6,%rsi
> > mov    -0xa01000(%rsi),%r10d
> > data32 xchg %ax,%ax
> > data32 xchg %ax,%ax
> > rdtsc
> > shl    $0x20,%rdx
> > mov    %eax,%eax
> > movsbl -0xa00fe4(%rsi),%ecx
> > or     %rax,%rdx
> > sub    -0xa00ff8(%rsi),%rdx
> > mov    -0xa00fe8(%rsi),%r11d
> > mov    %rdx,%rax
> > shl    %cl,%rax
> > test   %ecx,%ecx
> > js     0xa08 <vread_pvclock+200>
> > mov    %r11d,%edx
> > movzbl -0xa00fe3(%rsi),%ecx
> > mov    -0xa00ff0(%rsi),%r11
> > mul    %rdx
> > shrd   $0x20,%rdx,%rax
> > data32 xchg %ax,%ax
> > data32 xchg %ax,%ax
> > lsl    %r9w,%edx
> >
> > 3) Scheduler moves the task, while executing these 29 instructions, to a
> > destination processor, then back to the source processor.
> >
> > 4) Source processor, after has been moved back from destination,
> > perceives data out of order as written by processor performing guest
> > time update (item 1), with string mov.
> >
> > Given the rarity of this condition, and the fact it was never observed
> > or reported, reverting pvclock vsyscall on systems whose host is
> > susceptible to the race, seems unnecessary.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index cc2c759f69a3..8658599e0024 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1658,12 +1658,24 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
> >                 &guest_hv_clock, sizeof(guest_hv_clock))))
> >                 return 0;
> >
> > -       /*
> > -        * The interface expects us to write an even number signaling that the
> > -        * update is finished. Since the guest won't see the intermediate
> > -        * state, we just increase by 2 at the end.
> > +       /* A guest can read other VCPU's kvmclock; specification says that
> > +        * version is odd if data is being modified and even after it is
> > +        * consistent.
> > +        * We write three times to be sure.
> > +        *  1) update version to odd number
> > +        *  2) write modified data (version is still odd)
> > +        *  3) update version to even number
> > +        *
> > +        * TODO: optimize
> > +        *  - only two writes should be enough -- version is first
> > +        *  - the second write could update just version
> 
> You're relying on lots of barely-defined behavior here, since I think
> that both copies could use fast string operations.  Those are
> explicitly unordered internally, so I think you really do need three
> writes.

Correct, 3 writes are needed.


  parent reply	other threads:[~2015-04-18  0:20 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-17 23:38 KVM: x86: fix kvmclock write race Marcelo Tosatti
2015-04-18  0:04 ` Andy Lutomirski
2015-04-18  0:12   ` Marcelo Tosatti
2015-04-18  0:20   ` Marcelo Tosatti [this message]
2015-04-18  0:23     ` KVM: x86: fix kvmclock write race (v2) Marcelo Tosatti
2015-04-20 12:54       ` Radim Krčmář

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150418002010.GA23227@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.