From: David Hildenbrand <david@redhat.com>
To: Eric Farman <farman@linux.ibm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Thomas Huth <thuth@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
kvm@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability
Date: Mon, 22 Nov 2021 11:52:52 +0100 [thread overview]
Message-ID: <858e4f2b-d601-a4f1-9e80-8f7838299c9a@redhat.com> (raw)
In-Reply-To: <cd1c11a05cc13fb8c70ce3644dcf823a840872b5.camel@linux.ibm.com>
On 19.11.21 21:20, Eric Farman wrote:
> On Wed, 2021-11-17 at 08:54 +0100, Christian Borntraeger wrote:
>> Am 11.11.21 um 20:05 schrieb Eric Farman:
>>> On Thu, 2021-11-11 at 19:29 +0100, David Hildenbrand wrote:
>>>> On 11.11.21 18:48, Eric Farman wrote:
>>>>> On Thu, 2021-11-11 at 17:13 +0100, Janosch Frank wrote:
>>>>>>
>>>>>> Looking at the API I'd like to avoid having two IOCTLs
>>>>>
>>>>> Since the order is a single byte, we could have the payload of
>>>>> an
>>>>> ioctl
>>>>> say "0-255 is an order that we're busy processing, anything
>>>>> higher
>>>>> than
>>>>> that resets the busy" or something. That would remove the need
>>>>> for
>>>>> a
>>>>> second IOCTL.
>>>>
>>>> Maybe just pass an int and treat a negative (or just -1) value as
>>>> clearing the order.
>>>>
>>>
>>> Right, that's exactly what I had at one point. I thought it was too
>>> cumbersome, but maybe not. Will dust it off, pending my question to
>>> Janosch about 0-vs-1 IOCTLs.
>>
>> As a totally different idea. Would a sync_reg value called SIGP_BUSY
>> work as well?
>>
>
> Hrm... I'm not sure. I played with it a bit, and it's not looking
> great. I'm almost certainly missing some serialization, because I was
> frequently "losing" one of the toggles (busy/not-busy) when hammering
> CPUs with various SIGP orders on this interface and thus getting
> incorrect responses from the in-kernel orders.
You can only modify the destination CPU from the destination CPU thread,
after synchronizing the CPU state. This would be trivial with my QEMU proposal.
>
> I also took a stab at David's idea of tying it to KVM_MP_STATE [1]. I
> still think it's a little odd, considering the existing states are all
> z/Arch-defined CPU states, but it does sound like the sort of thing
> we're trying to do (letting userspace announce what the CPU is up to).
> One flaw is that most of the rest of QEMU uses s390_cpu_set_state() for
> this, which returns the number of running CPUs instead of the return
> code from the MP_STATE ioctl (via kvm_s390_set_cpu_state()) that SIGP
> would be interested in. Even if I made the ioctl call directly, I still
> encounter some system problems that smell like ones I've addressed in
> v2 and v3. Possibly fixable, but I didn't pursue them far enough to be
> certain.
Well, we can essentially observe this special state of that CPU ("stopping"), so
it's not that weird. STOPPING is essentially OPERATING with the notion of
"the CPU is blocked for some actions.".
>
> I ALSO took a stab at folding this into the S390 IRQ paths [2], similar
> to what was done with kvm_s390_stop_info. This worked reasonably well,
> except the QEMU interface kvm_s390_vcpu_interrupt() returns a void, and
> so wouldn't notice an error sent back by KVM. Not a deal breaker, but
> having not heard anything to this idea, I didn't go much farther.
We only care about SIGP STOP* handling so far, if anybody is aware of other issues
that need fixing, it would be helpful to spell them out. I'll keep assuming that
only SIGP STOP* needs fixing, as I already explained.
Whenever QEMU tells a CPU to stop asynchronously, it does so via a STOP IRQ from
the destination CPU thread via KVM_S390_SIGP_STOP. Then, we know the CPU is busy
... until we clear that interrupt, which happens via kvm_s390_clear_stop_irq().
Interestingly, we clear that interrupt via two paths:
1) kvm_s390_clear_local_irqs(), via KVM_S390_INITIAL_RESET and
KVM_S390_NORMAL_RESET. Here, we expect that new user space also sets
the CPU to stopped via KVM_MP_STATE_STOPPED. In fact, modern QEMU
properly sets the CPU stopped before triggering clearing of the
interrupts (s390_cpu_reset()).
2) kvm_s390_clear_stop_irq() via kvm_s390_vcpu_stop(), which is
triggered via:
a) STOP intercept (handle_stop()), KVM_S390_INITIAL_RESET and
KVM_S390_NORMAL_RESET with old user space --
!kvm_s390_user_cpu_state_ctrl().
b) KVM_MP_STATE_STOPPED for modern user space.
Would the following solve our SIGP STOP * issue w.o. uapi changes?
a) Kernel
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c6257f625929..bd7ee1ea8aa8 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4643,10 +4643,15 @@ int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
}
}
- /* SIGP STOP and SIGP STOP AND STORE STATUS has been fully processed */
+ /*
+ * SIGP STOP and SIGP STOP AND STORE STATUS have been fully
+ * processed. Clear the interrupt after setting the VCPU stopped,
+ * such that the VCPU remains busy for most SIGP orders until fully
+ * stopped.
+ */
+ kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
kvm_s390_clear_stop_irq(vcpu);
- kvm_s390_set_cpuflags(vcpu, CPUSTAT_STOPPED);
__disable_ibs_on_vcpu(vcpu);
for (i = 0; i < online_vcpus; i++) {
diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index cf4de80bd541..e40f6412106d 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -276,6 +276,38 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
if (!dst_vcpu)
return SIGP_CC_NOT_OPERATIONAL;
+ /*
+ * SIGP STOP * orders are the only SIGP orders that are processed
+ * asynchronously, and can theoretically, never complete.
+ *
+ * Until the destination VCPU is stopped via kvm_s390_vcpu_stop(), we
+ * have a stop interrupt pending. While we have a pending stop
+ * interrupt, that CPU is busy for most SIGP orders.
+ *
+ * This is important, because otherwise a single VCPU could issue on an
+ * operating destination VCPU:
+ * 1) SIGP STOP $DEST
+ * 2) SIGP SENSE $DEST
+ * And have 2) not rejected with BUSY although the destination is still
+ * processing the pending SIGP STOP * order.
+ *
+ * Relevant code has to make sure to complete the SIGP STOP * action
+ * (e.g., setting the CPU stopped, storing the status) before clearing
+ * the STOP interrupt.
+ */
+ if (order_code != SIGP_INITIAL_CPU_RESET &&
+ order_code != SIGP_CPU_RESET) {
+ /*
+ * Lockless check. SIGP STOP / SIGP RE(START) properly
+ * synchronizes when processing these orders. In any other case,
+ * we don't particularly care about races, as the guest cannot
+ * observe the difference really when issuing orders from two
+ * differing VCPUs.
+ */
+ if (kvm_s390_is_stop_irq_pending(dst_vcpu))
+ return SIGP_CC_BUSY;
+ }
+
switch (order_code) {
case SIGP_SENSE:
vcpu->stat.instruction_sigp_sense++;
b) QEMU
diff --git a/target/s390x/sigp.c b/target/s390x/sigp.c
index 51c727834c..e97e3a60fd 100644
--- a/target/s390x/sigp.c
+++ b/target/s390x/sigp.c
@@ -479,13 +479,17 @@ void do_stop_interrupt(CPUS390XState *env)
{
S390CPU *cpu = env_archcpu(env);
- if (s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu) == 0) {
- qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
- }
+ /*
+ * Complete the STOP operation before exposing the CPU as STOPPED to
+ * the system.
+ */
if (cpu->env.sigp_order == SIGP_STOP_STORE_STATUS) {
s390_store_status(cpu, S390_STORE_STATUS_DEF_ADDR, true);
}
env->sigp_order = 0;
+ if (s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu) == 0) {
+ qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
+ }
env->pending_int &= ~INTERRUPT_STOP;
}
--
Thanks,
David / dhildenb
next prev parent reply other threads:[~2021-11-22 10:52 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-10 20:33 [RFC PATCH v3 0/2] s390x: Improvements to SIGP handling [KVM] Eric Farman
2021-11-10 20:33 ` [RFC PATCH v3 1/2] Capability/IOCTL/Documentation Eric Farman
2021-11-10 20:33 ` [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
2021-11-11 9:15 ` David Hildenbrand
2021-11-11 15:03 ` Eric Farman
2021-11-11 16:13 ` Janosch Frank
2021-11-11 17:48 ` Eric Farman
2021-11-11 18:29 ` David Hildenbrand
2021-11-11 19:05 ` Eric Farman
2021-11-11 19:15 ` David Hildenbrand
2021-11-11 19:44 ` Eric Farman
2021-11-12 9:34 ` David Hildenbrand
2021-11-12 9:35 ` David Hildenbrand
2021-11-17 7:54 ` Christian Borntraeger
2021-11-19 20:20 ` Eric Farman
2021-11-22 10:52 ` David Hildenbrand [this message]
2021-11-23 17:42 ` Eric Farman
2021-11-23 18:44 ` David Hildenbrand
2021-11-30 20:11 ` Eric Farman
2021-11-12 8:49 ` Janosch Frank
2021-11-12 16:09 ` Eric Farman
2021-11-12 20:30 ` Eric Farman
2021-11-11 16:16 ` Janosch Frank
2021-11-11 17:50 ` Eric Farman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=858e4f2b-d601-a4f1-9e80-8f7838299c9a@redhat.com \
--to=david@redhat.com \
--cc=borntraeger@de.ibm.com \
--cc=farman@linux.ibm.com \
--cc=frankja@linux.ibm.com \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=imbrenda@linux.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox