public inbox for kvm@vger.kernel.org
From: Janosch Frank <frankja@linux.ibm.com>
To: Josephine Pfeiffer <hi@josie.lol>, borntraeger@linux.ibm.com
Cc: imbrenda@linux.ibm.com, david@kernel.org, hca@linux.ibm.com,
	gor@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com,
	kvm@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: s390: Implement CHECK_STOP support and fix GET_MP_STATE
Date: Tue, 25 Nov 2025 19:10:43 +0100
Message-ID: <fd5ad2be-f15f-425f-b8ef-087dc639024d@linux.ibm.com>
In-Reply-To: <20251120182849.1109773-1-hi@josie.lol>

On 11/20/25 19:28, Josephine Pfeiffer wrote:
> On Mon, 17 Nov 2025 19:14:57 +0100, Christian Borntraeger wrote:
>> On 17.11.25 at 16:18, Josephine Pfeiffer wrote:
>>> Add support for KVM_MP_STATE_CHECK_STOP to enable proper VM migration
>>> and error handling for s390 guests. The CHECK_STOP state represents a
>>> CPU that encountered a severe machine check and is halted in an error
>>> state.
>>
>> I think the patch description is misleading. We do have proper VM
>> migration and we also have error handling in the kvm module. The host
>> machine check handler will forward guest machine checks to the guest.
>> This logic is certainly not perfect but kind of good enough for most
>> cases.
> 
> First of all, thank you for taking the time to look at my patch, and sorry
> for taking so long to write up the reply.
> 
> You're right, QEMU migrates cpu_state via vmstate [1] and only uses
> KVM_SET_MP_STATE to restore the state after migration [2], never calling
> KVM_GET_MP_STATE. So I misunderstood something there.
> 
> What prompted me to look into this was that the KVM API has advertised
> CHECK_STOP support without implementing it.
> Looking at commit 6352e4d2dd9a [3] from 2014: "KVM: s390: implement
> KVM_(S|G)ET_MP_STATE for user space state control"
> 
> This commit added KVM_MP_STATE_CHECK_STOP to include/uapi/linux/kvm.h [4] and
> documented it in Documentation/virtual/kvm/api.txt with:
> 
>    "KVM_MP_STATE_CHECK_STOP: the vcpu is in a special error state [s390]"
> 
> But the implementation was explicitly deferred with a fallthrough comment [3]:
> 
>    case KVM_MP_STATE_LOAD:
>    case KVM_MP_STATE_CHECK_STOP:
>        /* fall through - CHECK_STOP and LOAD are not supported yet */
>    default:
>        rc = -ENXIO;
> 
> This created a bit of an API asymmetry where:
> - Documentation/virt/kvm/api.rst:1546 [5] advertises CHECK_STOP as valid
> - KVM_SET_MP_STATE rejects it with -ENXIO
> - KVM_GET_MP_STATE never returns it (always returns STOPPED or OPERATING) [6]
> 
>> Now: The architecture defines that state and the interface is certainly
>> there. So implementing it will allow userspace to put a CPU into checkstop
>> state if you ever need that. We also have a checkstop state that you
>> can put a secure CPU in.
>>
>> The use case is dubious though. The only case of the options from POP
>> chapter 11 that makes sense to me in a virtualized environment is an exigent
>> machine check but a problem to actually deliver that (multiple reasons,
>> like the OS has machine checks disabled in PSW, or the prefix register
>> is broken).
>>
>> So I am curious, do you have any specific usecase in mind?
>> I assume you have a related QEMU patch somewhere?
> 
> The use cases I see are:
> 
> 1. API completeness: The state was added to the UAPI 11 years ago but never
>     implemented. Userspace cannot use a documented API feature.

I'd rather have stubs that fence properly than code that is never tested
because nothing uses it.

Since this never worked, it might make sense to remove it: future
users will need to check for this "feature" before using it anyway.

> 
> 2. Fault injection testing: Administrators testing failover/monitoring for
>     hardware failures could programmatically put a CPU into CHECK_STOP to
>     verify their procedures work.

How would that work?
What can we gain from putting a CPU into checkstop?
How would QEMU use this?


Checkstop is not an error communication medium; that's the machine check
interrupt. If you want to inject faults, then use the machine check
interface.

If you want to crash the guest, then panic it or just stop the CPUs.

Thread overview: 5+ messages
2025-11-17 15:18 [PATCH] KVM: s390: Implement CHECK_STOP support and fix GET_MP_STATE Josephine Pfeiffer
2025-11-17 18:14 ` Christian Borntraeger
2025-11-20 18:28   ` Josephine Pfeiffer
2025-11-25 18:10     ` Janosch Frank [this message]
2025-12-11 10:54       ` Josephine Pfeiffer
