From: Paolo Bonzini <pbonzini@redhat.com>
To: Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Cc: gleb@kernel.org, avi.kivity@gmail.com, mtosatti@redhat.com,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
stable@vger.kernel.org, David Matlack <dmatlack@google.com>
Subject: Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number
Date: Mon, 18 Aug 2014 20:47:37 +0200
Message-ID: <53F24A49.2010807@redhat.com>
In-Reply-To: <9AD43423-2FF3-422D-A5AD-61CAE6339CCC@linux.vnet.ibm.com>
On 18/08/2014 18:35, Xiao Guangrong wrote:
>
> Hi Paolo,
>
> Thank you for reviewing the patch!
>
> On Aug 18, 2014, at 9:57 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> On 14/08/2014 09:01, Xiao Guangrong wrote:
>>> - update_memslots(slots, new, kvm->memslots->generation);
>>> + /* ensure generation number is always increased. */
>>> + slots->generation = old_memslots->generation;
>>> + update_memslots(slots, new);
>>> rcu_assign_pointer(kvm->memslots, slots);
>>> synchronize_srcu_expedited(&kvm->srcu);
>>> + slots->generation++;
>>
>> I don't trust my brain enough to review this patch.
>
> Sorry for confusing you. I should explain it more clearly.
Don't worry, it's not your fault. :)
>> kvm_current_mmio_generation seems like a very bad (race-prone) API. One
>> patch I trust myself reviewing would change a bunch of functions in
>> kvm_main.c to take a memslots struct. This would make it easy to
>> respect the hard and fast rule of not dereferencing the same pointer
>> twice. But it would be a tedious change.
>
> kvm_set_memory_region is the only place that updates the memslots, and
> kvm_current_mmio_generation accesses the memslots via rcu-dereference, so
> I do not see why other places need to be taken into account.
The race occurs because gfn_to_pfn_many_atomic or some other function
has already used kvm_memslots(). Calling kvm_memslots() twice is the
root cause of the bug.
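
To illustrate the double-dereference problem, here is a minimal user-space
sketch. It is not kernel code: struct slots_model, struct vm_model, the
update hook and both fill_cache_* functions are hypothetical names, and a
plain pointer load stands in for the RCU dereference done by kvm_memslots().
The hook simulates a memslot update landing between the two reads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a memslot table: a payload plus its generation. */
struct slots_model {
    unsigned long generation;
    int payload;                    /* stands in for a memslot lookup result */
};

struct vm_model {
    struct slots_model *memslots;   /* replaced wholesale on update */
};

struct cache_entry {
    int payload;
    unsigned long generation;
};

/* Hook that lets the caller simulate a concurrent memslot update. */
typedef void (*update_hook)(struct vm_model *vm);

/* Racy shape: reads vm->memslots twice, so the cached data and its
 * generation tag can come from two different tables. */
static struct cache_entry fill_cache_twice(struct vm_model *vm, update_hook hook)
{
    struct cache_entry e;

    assert(vm != NULL && vm->memslots != NULL);
    e.payload = vm->memslots->payload;          /* first dereference */
    if (hook)
        hook(vm);                               /* update lands here */
    e.generation = vm->memslots->generation;    /* second dereference */
    return e;
}

/* Safer shape: take one snapshot and derive both values from it. */
static struct cache_entry fill_cache_once(struct vm_model *vm, update_hook hook)
{
    struct slots_model *slots;
    struct cache_entry e;

    assert(vm != NULL && vm->memslots != NULL);
    slots = vm->memslots;                       /* single dereference */
    e.payload = slots->payload;
    if (hook)
        hook(vm);                               /* cannot split the pair */
    e.generation = slots->generation;
    return e;
}

/* Demo helper: install a newer table, as a memslot update would. */
static struct slots_model newer_slots = { .generation = 2, .payload = 200 };

static void install_newer_slots(struct vm_model *vm)
{
    vm->memslots = &newer_slots;
}
```

With an older table of generation 1 installed, fill_cache_twice() called
with install_newer_slots returns the old payload tagged with generation 2,
while fill_cache_once() returns the old payload tagged with generation 1.
The sketch assumes the old table stays valid while the reader holds its
snapshot.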
> I think this patch is auditable, page-fault is always called by holding
> srcu-lock so that a page fault can’t go across synchronize_srcu_expedited.
> Only these cases can happen:
>
> 1) page fault occurs before synchronize_srcu_expedited.
> In this case, vcpu will generate mmio-exit for the memslot being registered
> by the ioctl. That’s ok since the ioctl has not finished.
>
> 2) page fault occurs after synchronize_srcu_expedited and during
> increasing generation-number.
> In this case, userspace may get a wrong mmio-exit (that happens if handling
> the page fault is slower than the ioctl); that’s ok too since userspace needs
> to do the check anyway, like I said above.
>
> 3) page fault occurs after generation-number update
> that’s definitely correct. :)
>
>> Another alternative could be to use the low bit to mark an in-progress
>> change, and skip the caching if the low bit is set. Similar to a
>> seqcount (except if read_seqcount_retry fails, we just punt and not
>> retry anything), you could use it even though the memory barriers
>> provided by write_seqcount_begin/end are not too useful in this case.
>
> I do not see how the bit works: a page fault can cache the memslot before
> the bit is set and cache the generation number after the bit is set.
>
> Maybe I missed your idea. Could you please explain it in more detail?
Something like this:
- update_memslots(slots, new, kvm->memslots->generation);
+ /* ensure generation number is always increased. */
+ slots->generation = old_memslots->generation + 1;
+ update_memslots(slots, new);
rcu_assign_pointer(kvm->memslots, slots);
synchronize_srcu_expedited(&kvm->srcu);
+ slots->generation++;
Then case 1 and 2 will just have a cache miss.
The "low bit" is really just because each slot update does 2 generation
increases.
Paolo
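
The two-step generation update above can be modelled in user space as
follows. This is only a sketch under stated assumptions: struct
memslots_model and all four helpers are hypothetical names, the generation
is assumed to be even whenever no update is in flight, and the explicit
low-bit test follows the "skip the caching if the low bit is set" idea
quoted earlier rather than any existing kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_memslots; only the generation
 * counter matters for this sketch. */
struct memslots_model {
    unsigned long generation;
};

/* An odd generation means a memslot update is in flight. */
static bool update_in_progress(unsigned long gen)
{
    return (gen & 1) != 0;
}

/* First half of an update: the new table is published with old + 1,
 * which is odd under the even-when-idle assumption. */
static void publish_new_slots(struct memslots_model *new_slots,
                              const struct memslots_model *old_slots)
{
    assert(!update_in_progress(old_slots->generation));
    new_slots->generation = old_slots->generation + 1;
}

/* Second half, run after the grace period: bump again so the generation
 * is even once the update is complete. */
static void finish_update(struct memslots_model *slots)
{
    slots->generation++;
}

/* A cached MMIO entry is reusable only if no update is in flight and the
 * entry was tagged with the current generation. */
static bool mmio_cache_usable(unsigned long cached_gen,
                              unsigned long current_gen)
{
    return !update_in_progress(current_gen) && cached_gen == current_gen;
}
```

Starting from generation 4, a fault that samples the generation between
publish_new_slots() and finish_update() tags its entry with 5. The final
generation is 6, so that entry never matches again and the next access is a
cache miss, which is the behaviour described for cases 1 and 2.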