All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Andrew Jones <drjones@redhat.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	James Hogan <james.hogan@imgtec.com>,
	Paul Mackerras <paulus@ozlabs.org>,
	Christoffer Dall <christoffer.dall@linaro.org>
Subject: Re: [PATCH 4/5] KVM: add __kvm_request_needs_mb
Date: Thu, 23 Feb 2017 16:43:54 +0100	[thread overview]
Message-ID: <20170223154354.GB8342@potion> (raw)
In-Reply-To: <58da709b-7000-2b21-a107-06e278dbc3aa@de.ibm.com>

2017-02-22 20:23+0100, Christian Borntraeger:
> On 02/22/2017 04:17 PM, Radim Krčmář wrote:
>> [Oops, the end of this thread got dragged into a mark-as-read spree ...]
>> 
>> 2017-02-17 11:13+0100, David Hildenbrand:
>>>>> This is really complicated stuff, and the basic reason for it (if I
>>>>> remember correctly) is that s390x does reenable all interrupts when
>>>>> entering the sie (see kvm-s390.c:__vcpu_run()). So the fancy smp-based
>>>>> kicks don't work (as it is otherwise just racy), and if I remember
>>>>> correctly, SMP reschedule signals (s390x external calls) would be
>>>>> slower. (Christian, please correct me if I'm wrong)
>>>>
>>>> No the reason was that there are some requests that need to be handled
>>>> outside run SIE. For example one reason was the guest prefix page.
>>>> This must be mapped read/write ALL THE TIME when a guest is running,
>>>> otherwise the host might crash. So we have to exit SIE and make sure that
>>>> it does not reenter, therefore we use the RELOAD_MMU request from a notifier
>>>> that is called from page table functions, whenever memory management decides
>>>> to unmap/write protect (dirty pages tracking, reference tracking, page migration
>>>> or compaction...)
>>>>
>>>> SMP-based request wills kick out the guest, but for some thing like the
>>>> one above it will be too late.
>>>
>>> While what you said is 100% correct, I had something else in mind that
>>> hindered using vcpu_kick() and especially kvm_make_all_cpus_request().
>>> And I remember that being related to how preemption and
>>> OUTSIDE_GUEST_MODE is handled. I think this boils down to what would
>>> have to be implemented in kvm_arch_vcpu_should_kick().
>>>
>>> x86 can track the guest state using vcpu->mode, because they can be sure
>>> that the guest can't reschedule while in the critical guest entry/exit
>>> section. This is not true for s390x, as preemption is enabled. That's
>>> why vcpu->mode cannot be used in its current form to track if a VCPU is
>>> in/oustide/exiting guest mode. And kvm_make_all_cpus_request() currently
>>> relies on this setting.
>>>
>>> For now, calling vcpu_kick() on s390x will result in a BUG().
>>>
>>>
>>> On s390x, there are 3 use cases I see for requests:
>>>
>>> 1. Remote requests that need a sync
>>>
>>> Make a request, wait until SIE has been left and make sure the request
>>> will be processed before re-entering the SIE. e.g. KVM_REQ_RELOAD_MMU
>>> notifier in mmu notifier you mentioned. Also KVM_REQ_DISABLE_IBS is a
>>> candidate.
>> 
>> Btw. aren't those requests racy?
>> 
>>     void exit_sie(struct kvm_vcpu *vcpu)
>>     {
>>     	atomic_or(CPUSTAT_STOP_INT, &vcpu->arch.sie_block->cpuflags);
>> 
>> If you get stalled here and the target VCPU handles the request and
>> reenters SIE in the meantime, then you'll wait until its next exit.
>> (And miss an unbounded amount of exits in the worst case.)
>> 
>>     	while (vcpu->arch.sie_block->prog0c & PROG_IN_SIE)
>>     		cpu_relax();
>>     }
>> 
> 
> Its not racy for the purpose it was originally made for (get the vcpu 
> out of SIE before we unmap a guest prefix page) as the MMU_RELOAD handler 
> will wait for the pte lock which is held by the code that called
> kvm_s390_sync_request(KVM_REQ_MMU_RELOAD, vcpu).
> 
> We also have the guarantee that after returning from kvm_s390_sync_request
> we will have that request be handled before we reenter the guest, which is
> all we need for DISABLE_IBS. 
> 
> But yes, all non MMU_RELOAD users might wait longer, possibly several guest
> exits. We never noticed that as requests are really a seldom event. Basically
> unmapping of the guest prefix page due to paging and migration, switching 
> between 1 and more guest cpus and some other seldom events.

Ok, thanks for the info.

I don't think that we'll find too many use-cases to demand inclusion
into a generic kick/request API, so having a function that waits until a
VCPU is out of guest mode would be more suited for generic code.

  reply	other threads:[~2017-02-23 15:43 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-16 16:04 [PATCH 0/5] KVM: rename and extend vcpu->requests API Radim Krčmář
2017-02-16 16:04 ` [PATCH 1/5] KVM: change API for requests to match bit operations Radim Krčmář
2017-02-17  9:30   ` Cornelia Huck
2017-02-17  9:49     ` Andrew Jones
2017-02-17  9:52       ` Cornelia Huck
2017-02-17 15:01     ` Radim Krčmář
2017-02-16 16:04 ` [PATCH 2/5] KVM: add KVM request variants without barrier Radim Krčmář
2017-02-23 10:57   ` Paolo Bonzini
2017-02-23 15:50     ` Radim Krčmář
2017-02-16 16:04 ` [PATCH 3/5] KVM: optimize kvm_make_all_cpus_request Radim Krčmář
2017-02-16 16:04 ` [PATCH 4/5] KVM: add __kvm_request_needs_mb Radim Krčmář
2017-02-16 19:49   ` David Hildenbrand
2017-02-16 21:31     ` Radim Krčmář
2017-02-17  8:46     ` Christian Borntraeger
2017-02-17 10:13       ` David Hildenbrand
2017-02-17 10:19         ` Christian Borntraeger
2017-02-17 11:28         ` Christian Borntraeger
2017-02-22 15:17         ` Radim Krčmář
2017-02-22 19:23           ` Christian Borntraeger
2017-02-23 15:43             ` Radim Krčmář [this message]
2017-02-22 19:57           ` Christian Borntraeger
2017-02-23 10:20             ` David Hildenbrand
2017-02-23 15:39               ` Radim Krčmář
2017-02-24 11:34           ` Christoffer Dall
2017-02-24 12:46             ` Andrew Jones
2017-02-23 11:01   ` Paolo Bonzini
2017-02-23 15:52     ` Radim Krčmář
2017-02-16 16:04 ` [PATCH 5/5] KVM: add kvm_request_pending Radim Krčmář
2017-02-16 19:50   ` David Hildenbrand
2017-02-17  9:51   ` Andrew Jones
2017-02-17 14:59     ` Radim Krčmář

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170223154354.GB8342@potion \
    --to=rkrcmar@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=christoffer.dall@linaro.org \
    --cc=cornelia.huck@de.ibm.com \
    --cc=david@redhat.com \
    --cc=drjones@redhat.com \
    --cc=james.hogan@imgtec.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marc.zyngier@arm.com \
    --cc=paulus@ozlabs.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.