Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: Sascha Bischoff <Sascha.Bischoff@arm.com>
To: "peter.maydell@linaro.org" <peter.maydell@linaro.org>
Cc: "yuzenghui@huawei.com" <yuzenghui@huawei.com>,
	Timothy Hayes <Timothy.Hayes@arm.com>,
	Suzuki Poulose <Suzuki.Poulose@arm.com>, nd <nd@arm.com>,
	"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
	"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Joey Gouly <Joey.Gouly@arm.com>,
	"maz@kernel.org" <maz@kernel.org>,
	"oliver.upton@linux.dev" <oliver.upton@linux.dev>
Subject: Re: [PATCH 43/43] Documentation: KVM: Add the VGICv5 IRS save/restore sequences
Date: Fri, 8 May 2026 17:10:55 +0000	[thread overview]
Message-ID: <cb9c0e76068f94d14f071573102e3297aa00d0c6.camel@arm.com> (raw)
In-Reply-To: <CAFEAcA8m1Y-8=+7RZjte1WBwycgnHEx_+yiU_ieriZr5DbQKsQ@mail.gmail.com>

On Thu, 2026-04-30 at 09:57 +0100, Peter Maydell wrote:
> On Mon, 27 Apr 2026 at 17:22, Sascha Bischoff
> <Sascha.Bischoff@arm.com> wrote:
> > 
> > When saving/restoring the state of the GICv5 IRS, it is important
> > that
> > it happens in the correct order. Failure to do so will almost
> > certainly result in failing to restore a guest that is capable of
> > handling interrupts correctly.
> > 
> > On a save, the ISTs must be saved prior to saving the guest's
> > memory
> > as the guest's LPI IST is written to guest memory. Conversely, on
> > restore the guest's memory must be restored prior to restoring the
> > ISTs.
> > 
> > It is important to restore the IRS MMIO registers by first
> > restoring
> > the IRS_IDx registers as they define the capabilities of the IRS,
> > and
> > are used as part of creating and managing ISTs and SPIs.
> > 
> > In order to restore the ISTs themselves, the IRS_IST_CFGR must be
> > restored prior to the IRS_IST_BASER. This is because KVM extracts
> > fields from the CFGR to determine the size and structure of the IRS
> > created by the guest. The IST itself is created as part of the
> > write
> > to the IRS_IST_BASER. At this stage the remaining MMIO registers
> > can
> > be restored.
> > 
> > Once the LPI IST has been created (by the aforementioned write to
> > the
> > IRS_IST_BASER), the IST state can be restored using
> > KVM_DEV_ARM_VGIC_GRP_IST. The SPI IST gets extracted from a
> > userspace
> > provided buffer, and is transferred to the host-allocated SPI IST.
> > The
> > LPI IST is extracted from guest memory, and is written to the
> > host-allocated LPI IST.
> > 
> > As a general rule, the IRS_*_STATUSR registers can be ignored on
> > restore. They are not userspace writable.
> > 
> > Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
> > ---
> >  .../virt/kvm/devices/arm-vgic-v5.rst          | 63
> > +++++++++++++++++++
> >  1 file changed, 63 insertions(+)
> > 
> > diff --git a/Documentation/virt/kvm/devices/arm-vgic-v5.rst
> > b/Documentation/virt/kvm/devices/arm-vgic-v5.rst
> > index 38eef7cc63e3e..1c55f5040757d 100644
> > --- a/Documentation/virt/kvm/devices/arm-vgic-v5.rst
> > +++ b/Documentation/virt/kvm/devices/arm-vgic-v5.rst
> > @@ -201,3 +201,66 @@ Groups:
> >        -ENOMEM      Restoring IST state failed while tracking
> > pending interrupts
> >        -ETIMEDOUT   An IRS save/VM operation timed out
> >        =========== 
> > ============================================================
> > +
> > +IRS Save Sequence:
> > +------------------
> > +
> > +The following ordering should be followed when saving the virtual
> > GICv5 and
> > +IRS:
> > +
> > +a) Save the ISTs by issuing KVM_GET_DEVICE_ATTR on
> > KVM_DEV_ARM_VGIC_GRP_IST.
> > +   This MUST happen before the guest's memory is serialised as the
> > LPI IST is
> > +   stored directly to guest memory.
> > +
> > +b) Save the IRS MMIO register state in the following order by
> > issuing
> > +   KVM_GET_DEVICE_ATTR on KVM_DEV_ARM_VGIC_GRP_IRS_REGS:
> > +
> > +     1. Save IRS_IDR0-2 and IRS_IDR5-7 registers.
> > +     2. Save IRS_IST_CFGR.
> > +     3. Save IRS_IST_BASER.
> > +     4. Save the remaining global IRS MMIO registers.
> > +     5. For each PE:
> > +        - write IRS_PE_SELR
> > +        - save IRS_PE_CR0
> > +     6. For each SPI:
> > +        - write IRS_SPI_SELR
> > +        - save IRS_SPI_CFGR
> > +
> > +IRS Restore Sequence:
> > +---------------------
> > +
> > +The following ordering must be followed when restoring the virtual
> > GICv5 and
> > +IRS:
> > +
> > +a) restore all guest memory and create vcpus
> > +b) provide the IRS base address by issuing KVM_SET_DEVICE_ATTR on
> > +   KVM_DEV_ARM_VGIC_GRP_ADDR
> > +c) initialise the GIC - this sets up the default state and creates
> > the SPI
> > +   IST - by issuing KVM_SET_DEVICE_ATTR on
> > KVM_DEV_ARM_VGIC_GRP_CTRL with
> > +   KVM_DEV_ARM_VGIC_CTRL_INIT
> 
> This isn't going to work for QEMU, if I understand it correctly.
> QEMU always creates the whole VM first, including creating the
> VCPUs and GIC, telling KVM what its base address is, initializing it,
> etc, before it starts an inbound migration. So the memory read
> is going to come in after step (c), not right at the start.

Hi Peter,

Thanks for the feedback, and excuse my slow response.

So, just to make sure I understand, QEMU does:

a) Create VCPUs
b) Create GIC, supply GIC base address
c) Init GIC
d) Restore guest memory
e) Restore state

If so, I think things can still work broadly as I'd intended as this
part of the ordering can be changed.

One a save, the guest's LPI IST is written to guest memory. On restore,
it needs to be read back from there. Therefore, said memory needs to be
available at the point that one restores the ISTs. The MMIO regs convey
the size of the IST, and hence those need to be restored before the
ISTs themselves.

I think we could do:

a) Create VCPUs
b) Create GIC, supply GIC base address
c) Init GIC
d) Restore guest memory
e) Restore IRS MMIO regs
f) Restore the ISTs by issuing KVM_SET_DEVICE_ATTR on
KVM_DEV_ARM_VGIC_GRP_IST.

> 
> > +d) restore the IRS MMIO register state in the following order by
> > issuing
> > +   KVM_SET_DEVICE_ATTR on KVM_DEV_ARM_VGIC_GRP_IRS_REGS:
> > +
> > +     1. Restore IRS_IDR0-2 and IRS_IDR5-7 registers.
> > +     2. Restore IRS_IST_CFGR.
> > +     3. Restore IRS_IST_BASER - this triggers KVM to create the
> > LPI IST.
> > +
> > +e) restore the ISTs by issuing KVM_SET_DEVICE_ATTR on
> > +   KVM_DEV_ARM_VGIC_GRP_IST.
> > +f) restore the remaining IRS MMIO register state in the following
> > order by
> > +   issuing KVM_SET_DEVICE_ATTR on KVM_DEV_ARM_VGIC_GRP_IRS_REGS:
> > +
> > +     1. Restore the remaining global IRS MMIO registers.
> > +     2. For each PE:
> > +        - write IRS_PE_SELR
> > +        - restore IRS_PE_CR0
> > +     3. For each SPI:
> > +        - write IRS_SPI_SELR
> > +        - restore IRS_SPI_CFGR
> 
> More generally, if your API involves this much in the way
> of complicated ordering dependencies, it's going to be
> very bug prone.  From userspace's perspective, this is
> not a very helpful way to design the interface :-)

I think it can be simplified to what I have put above.

One of the reasons for the complexity was that I was trying to re-use
the existing LPI IST creation mechanism - on a write to the
IRS_IST_BASER, but that's not necessary. For example, that allocation
could made part of the KVM_DEV_ARM_VGIC_GRP_IST ioctl, removing the
ordering requirement for the IRS_IST_BASER and IRS_IST_CFGR regs
altogether. Moreover, I think the IRS_PE_SELR, IRS_PE_CR0 and the
IRS_SPI_SELR, IRS_SPI_CFGR loops can both be omitted. This would allow
bulk MMIO save/restore for the IRS.

The IRS_PE_* loop was there to handle the case where a guest might've
not opted out of 1-of-N interrupt selection. However, supporting 1-of-N
selection is complex, and is not planned for KVM. As long as we are
happy to say that 1-of-N is not supported, then this can be removed. If
this were added in the future, this part of save/restore would need to
be revisited.

As for the other (SPI) loop, the intent here was to make sure that KVM
correctly tracks the state of level-sensitive and edge-triggered SPIs.
However, that information already exists in the IST that is being
restored, so if needs be, that can be extracted as part of restoring
the IST.

In summary, I think that the restore sequence can be simplified to what
I have put above, which is effectively bulk MMIO restore followed by
IST restore. Please let me know if you think that would work for QEMU.

FWIW, I think the save sequence can be simplified to:

a) Save the ISTs by issuing KVM_GET_DEVICE_ATTR on
KVM_DEV_ARM_VGIC_GRP_IST.
   This MUST happen before the guest's memory is serialised as the
LPI IST is stored directly to guest memory.
b) Save the IRS MMIO register state by
issuing KVM_GET_DEVICE_ATTR on KVM_DEV_ARM_VGIC_GRP_IRS_REGS:

Where a) & b) could happen in either order as long as the memory is
saved after the IST has been written to it.

> 
> thanks
> -- PMM

Thanks again for the feedback.
Sascha


      reply	other threads:[~2026-05-08 17:12 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-27 16:06 [PATCH 00/43] KVM: arm64: Add GICv5 IRS support Sascha Bischoff
2026-04-27 16:06 ` [PATCH 01/43] arm64/sysreg: Add GICv5 GIC VDPEND and VDRCFG encodings Sascha Bischoff
2026-04-27 16:06 ` [PATCH 02/43] arm64/sysreg: Update ICC_CR0_EL1 with LINK and LINK_IDLE fields Sascha Bischoff
2026-04-27 16:07 ` [PATCH 03/43] KVM: arm64: gic-v5: Add resident/non-resident hyp calls Sascha Bischoff
2026-04-28 14:28   ` Marc Zyngier
2026-05-01 16:40     ` Sascha Bischoff
2026-04-27 16:07 ` [PATCH 04/43] irqchip/gic-v5: Provide IRS config frame attrs to KVM Sascha Bischoff
2026-04-28 14:56   ` Marc Zyngier
2026-05-01 16:46     ` Sascha Bischoff
2026-04-27 16:07 ` [PATCH 05/43] KVM: arm64: gic-v5: Extract host IRS caps from IRS config frame Sascha Bischoff
2026-04-28 15:20   ` Marc Zyngier
2026-05-01 16:44     ` Sascha Bischoff
2026-04-27 16:08 ` [PATCH 06/43] KVM: arm64: gic-v5: Add VPE doorbell domain Sascha Bischoff
2026-04-28 16:40   ` Marc Zyngier
2026-05-01 16:54     ` Sascha Bischoff
2026-04-27 16:08 ` [PATCH 07/43] KVM: arm64: gic-v5: Create & manage VM and VPE tables Sascha Bischoff
2026-04-28 14:54   ` Vladimir Murzin
2026-05-01 16:42     ` Sascha Bischoff
2026-04-28 15:55   ` Joey Gouly
2026-05-08 12:42     ` Sascha Bischoff
2026-04-29 10:25   ` Marc Zyngier
2026-05-08 12:37     ` Sascha Bischoff
2026-04-27 16:08 ` [PATCH 08/43] KVM: arm64: gic-v5: Introduce guest IST alloc and management Sascha Bischoff
2026-04-29 14:29   ` Marc Zyngier
2026-05-08 12:43     ` Sascha Bischoff
2026-04-27 16:09 ` [PATCH 09/43] KVM: arm64: gic-v5: Implement VMT/vIST IRS MMIO Ops Sascha Bischoff
2026-04-29 12:50   ` Joey Gouly
2026-05-08 12:38     ` Sascha Bischoff
2026-04-29 16:04   ` Marc Zyngier
2026-05-08 13:31     ` Sascha Bischoff
2026-04-27 16:09 ` [PATCH 10/43] KVM: arm64: gic-v5: Implement VPE " Sascha Bischoff
2026-04-30  8:46   ` Marc Zyngier
2026-05-08 17:11     ` Sascha Bischoff
2026-04-27 16:09 ` [PATCH 11/43] KVM: arm64: gic-v5: Make VPEs valid in vgic_v5_reset() Sascha Bischoff
2026-04-30  9:37   ` Marc Zyngier
2026-05-08 17:08     ` Sascha Bischoff
2026-04-27 16:10 ` [PATCH 12/43] KVM: arm64: gic-v5: Clear db_fired flag before making VPE non-resident Sascha Bischoff
2026-04-27 16:10 ` [PATCH 13/43] KVM: arm64: gic-v5: Make VPEs (non-)resident in vgic_load/put Sascha Bischoff
2026-04-30 10:26   ` Marc Zyngier
2026-05-08 17:07     ` Sascha Bischoff
2026-04-27 16:10 ` [PATCH 14/43] KVM: arm64: gic-v5: Request VPE doorbells when going non-resident Sascha Bischoff
2026-04-30 10:37   ` Marc Zyngier
2026-04-27 16:11 ` [PATCH 15/43] KVM: arm64: gic-v5: Handle doorbells in kvm_vgic_vcpu_pending_irq() Sascha Bischoff
2026-04-27 16:11 ` [PATCH 16/43] KVM: arm64: gic-v5: Initialise and teardown VMTEs & doorbells Sascha Bischoff
2026-04-30 12:23   ` Marc Zyngier
2026-04-27 16:11 ` [PATCH 17/43] KVM: arm64: gic-v5: Enable VPE DBs on VPE reset and disable on teardown Sascha Bischoff
2026-05-06 15:03   ` Marc Zyngier
2026-04-27 16:12 ` [PATCH 18/43] KVM: arm64: gic-v5: Define remaining IRS MMIO registers Sascha Bischoff
2026-05-07 15:10   ` Marc Zyngier
2026-04-27 16:12 ` [PATCH 19/43] KVM: arm64: gic-v5: Introduce struct vgic_v5_irs and IRS base address Sascha Bischoff
2026-04-27 16:12 ` [PATCH 20/43] KVM: arm64: gic-v5: Add IRS IODEV to iodev_types and generic MMIO handlers Sascha Bischoff
2026-04-27 16:13 ` [PATCH 21/43] KVM: arm64: gic-v5: Add KVM_VGIC_V5_ADDR_TYPE_IRS to UAPI Sascha Bischoff
2026-04-27 16:13 ` [PATCH 22/43] KVM: arm64: gic-v5: Add GICv5 IRS IODEV and MMIO emulation Sascha Bischoff
2026-04-27 16:13 ` [PATCH 23/43] KVM: arm64: gic-v5: Set IRICHPPIDIS based on IRS enable state Sascha Bischoff
2026-04-27 16:14 ` [PATCH 24/43] KVM: arm64: gic-v5: Call IRS init/teardown from vgic_v5 init/teardown Sascha Bischoff
2026-04-27 16:14 ` [PATCH 25/43] KVM: arm64: gic-v5: Register the IRS IODEV Sascha Bischoff
2026-04-27 16:14 ` [PATCH 26/43] Documentation: KVM: Extend VGICv5 docs for KVM_VGIC_V5_ADDR_TYPE_IRS Sascha Bischoff
2026-04-27 16:15 ` [PATCH 27/43] KVM: arm64: selftests: Update vGICv5 selftest to set IRS address Sascha Bischoff
2026-04-27 16:15 ` [PATCH 28/43] KVM: arm64: gic-v5: Introduce SPI AP list Sascha Bischoff
2026-04-27 16:15 ` [PATCH 29/43] KVM: arm64: gic-v5: Add GIC VDPEND and GIC VDRCFG hyp calls Sascha Bischoff
2026-04-27 16:16 ` [PATCH 30/43] KVM: arm64: gic-v5: Track SPI state for in-flight SPIs Sascha Bischoff
2026-04-27 16:16 ` [PATCH 31/43] KVM: arm64: gic: Introduce set_pending_state() to irq_op Sascha Bischoff
2026-04-27 16:16 ` [PATCH 32/43] KVM: arm64: gic-v5: Support SPI injection Sascha Bischoff
2026-04-27 16:17 ` [PATCH 33/43] KVM: arm64: gic-v5: Add GICv5 SPI injection to irqfd Sascha Bischoff
2026-04-27 16:17 ` [PATCH 34/43] KVM: arm64: gic-v5: Mask per-vcpu PPI state in vgic_v5_finalize_ppi_state() Sascha Bischoff
2026-04-27 16:17 ` [PATCH 35/43] KVM: arm64: gic-v5: Add GICv5 EL1 sysreg userspace set/get interface Sascha Bischoff
2026-04-27 16:18 ` [PATCH 36/43] KVM: arm64: gic-v5: Implement save/restore mechanisms for ISTs Sascha Bischoff
2026-05-01 18:54   ` Vladimir Murzin
2026-04-27 16:18 ` [PATCH 37/43] KVM: arm64: gic-v5: Handle userspace accesses to IRS MMIO region Sascha Bischoff
2026-04-27 16:19 ` [PATCH 38/43] KVM: arm64: gic-v5: Add VGIC_GRP_IRS_REGS/VGIC_GRP_IST to UAPI Sascha Bischoff
2026-04-27 16:19 ` [PATCH 39/43] KVM: arm64: gic-v5: Plumb in has/set/get_attr for sysregs & IRS MMIO regs Sascha Bischoff
2026-04-27 16:19 ` [PATCH 40/43] Documentation: KVM: Document KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS for VGICv5 Sascha Bischoff
2026-04-27 16:20 ` [PATCH 41/43] Documentation: KVM: Add KVM_DEV_ARM_VGIC_GRP_IRS_REGS to VGICv5 docs Sascha Bischoff
2026-04-27 16:20 ` [PATCH 42/43] Documentation: KVM: Add docs for KVM_DEV_ARM_VGIC_GRP_IST Sascha Bischoff
2026-04-27 16:20 ` [PATCH 43/43] Documentation: KVM: Add the VGICv5 IRS save/restore sequences Sascha Bischoff
2026-04-30  8:57   ` Peter Maydell
2026-05-08 17:10     ` Sascha Bischoff [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cb9c0e76068f94d14f071573102e3297aa00d0c6.camel@arm.com \
    --to=sascha.bischoff@arm.com \
    --cc=Joey.Gouly@arm.com \
    --cc=Suzuki.Poulose@arm.com \
    --cc=Timothy.Hayes@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=lpieralisi@kernel.org \
    --cc=maz@kernel.org \
    --cc=nd@arm.com \
    --cc=oliver.upton@linux.dev \
    --cc=peter.maydell@linaro.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox