Linux KVM/arm64 development list
* RE: Bug? Incompatible APF for 4.14 guest on 5.10 and later host
@ 2023-10-13 15:40 Mancini, Riccardo
From: Mancini, Riccardo @ 2023-10-13 15:40 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvm@vger.kernel.org, Graf (AWS), Alexander, Teragni, Matias,
	Batalov, Eugene, Marc Zyngier, Oliver Upton,
	kvmarm@lists.linux.dev, Paolo Bonzini, vkuznets@redhat.com

> Adding Marc, Oliver and kvmarm@lists.linux.dev
> 
> I tried to make the feature available on ARM64 a long time ago, but the
> effort was discontinued because the main concern was that no users were
> asking for it [1].
> It's definitely exciting news to know it's an important feature to AWS. I
> guess it's probably another chance to re-evaluate the feature for ARM64?
> 
> [1] https://lore.kernel.org/kvmarm/87iloq2oke.wl-maz@kernel.org/
> 
> Async PF needs two signals sent from host to guest, and SDEI (Software
> Delegated Exception Interface) is leveraged for that. So there were two
> series to support SDEI virtualization [1] and Async PF on ARM64 [2].
> 
> [1] https://lore.kernel.org/kvmarm/20220527080253.1562538-1-gshan@redhat.com/
> [2] https://lore.kernel.org/kvmarm/20210815005947.83699-1-gshan@redhat.com/

Thanks for all the information! This might become useful in the future,
when we enable this feature on ARM, given the improvements we saw on x86.

> 
> I have several questions for Mancini; the answers would help me
> understand the situation better.
> 
> - The VM snapshot is stored somewhere remote, which means a page fault on
>    instruction fetch becomes expensive. Do we have benchmarks showing how
>    much benefit Async PF brings on x86 in the AWS environment?

In our small local repro (local disk access only), which runs a Java workload
after resuming a Firecracker VM, we saw a ~20% performance regression (from
~80ms to ~100ms), and the time spent outside the VM due to EPT_VIOLATION
tripled, from 30ms to 90ms. The impact is amplified when access is not local.

> 
> - I'm wondering if the data can be fetched from somewhere remote in the
>    AWS environment?

Without getting into details, yes, any memory page could be remotely accessed
in the worst case.

> 
> - The data can be stored in local DRAM or swap space; the page fault to
>    fetch the data becomes expensive if it is stored in swap space. Is it
>    possible for the data to reside in swap space in the AWS environment?
>    Note that the swap space, which corresponds to a disk, could be located
>    somewhere remote.

In our usage, during resumption almost all pages are missing and are populated
on demand with a userfaultfd, either from a local cache (memory or disk) or
from the network.
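
As a rough illustration of that on-demand flow, here is a toy Python model.
It does not use the real userfaultfd API; the class, the tier names and the
`fetch_remote` callback are all hypothetical and only mirror the lookup order
described above (local memory cache, then local disk cache, then network).

```python
PAGE_SIZE = 4096

class LazyGuestMemory:
    """Toy model: every page is missing after resume and is filled on first access."""

    def __init__(self, memory_cache, disk_cache, fetch_remote):
        self.pages = {}                   # populated pages: page number -> bytes
        self.memory_cache = memory_cache  # hypothetical local in-memory cache
        self.disk_cache = disk_cache      # hypothetical local on-disk cache
        self.fetch_remote = fetch_remote  # hypothetical network fetch callback
        self.fault_sources = []           # which tier served each fault, in order

    def _handle_fault(self, page_no):
        # Try the cheap local tiers first, fall back to the network.
        for name, tier in (("memory", self.memory_cache), ("disk", self.disk_cache)):
            if page_no in tier:
                self.fault_sources.append(name)
                return tier[page_no]
        self.fault_sources.append("network")
        return self.fetch_remote(page_no)

    def read(self, addr):
        page_no, offset = divmod(addr, PAGE_SIZE)
        if page_no not in self.pages:     # first touch: the page is missing
            self.pages[page_no] = self._handle_fault(page_no)
        return self.pages[page_no][offset]

def fake_remote(page_no):
    # Stand-in for a network fetch; returns a page filled with the page number.
    return bytes([page_no % 256]) * PAGE_SIZE

mem = LazyGuestMemory({0: b"\x01" * PAGE_SIZE}, {1: b"\x02" * PAGE_SIZE}, fake_remote)
values = [mem.read(0), mem.read(PAGE_SIZE), mem.read(2 * PAGE_SIZE + 5), mem.read(0)]
print(values, mem.fault_sources)
```

In this model only the first access to a page pays the fetch cost, and that
cost depends on which tier holds the page. That is why the stall is so much
worse when the page has to come from the network.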

Thanks,
Riccardo

> 
> Thanks,
> Gavin
> 




Thread overview: 2+ messages
2023-10-13 15:40 Bug? Incompatible APF for 4.14 guest on 5.10 and later host Mancini, Riccardo
     [not found] <1a68941c7abc4968a1e98627743256f3@amazon.com>
2023-10-06  1:39 ` Gavin Shan
