Re: Limitations for Running Xen on KVM Arm64

From: Julien Grall <julien@xen.org>
To: "haseeb.ashraf@siemens.com" <haseeb.ashraf@siemens.com>,
	Mohamed Mediouni <mohamed@unpredictable.fr>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"Volodymyr_Babchuk@epam.com" <Volodymyr_Babchuk@epam.com>,
	"Driscoll, Dan" <dan.driscoll@siemens.com>,
	"Bachtel, Andrew" <andrew.bachtel@siemens.com>,
	"fahad.arslan@siemens.com" <fahad.arslan@siemens.com>,
	"noor.ahsan@siemens.com" <noor.ahsan@siemens.com>,
	"brian.sheppard@siemens.com" <brian.sheppard@siemens.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Bertrand Marquis <Bertrand.Marquis@arm.com>,
	Michal Orzel <michal.orzel@amd.com>
Subject: Re: Limitations for Running Xen on KVM Arm64
Date: Sat, 1 Nov 2025 18:23:38 +0000
Message-ID: <01527182-ccef-43a5-be55-a5450eb7919f@xen.org>
In-Reply-To: <TYZPR06MB4580126B98C6A38AA710F597E6F8A@TYZPR06MB4580.apcprd06.prod.outlook.com>

(+ the other Arm maintainers)

On 31/10/2025 13:01, haseeb.ashraf@siemens.com wrote:
> Hello,

Hi,

Before answering the rest, would you be able to configure your e-mail 
client to quote with '>' and avoid top-posting? Otherwise, it will 
become quite difficult to follow the conversation after a few rounds.

> I have seen no such performance issue with nested KVM. For Xen, if this 
> can be relaxed from vmalls12e1 to vmalle1, this would still be a 
> huge performance improvement. I used Ftrace to get execution time of 
> each of these handler functions:
> handle_vmalls12e1is() min-max = 1464441 - 9495486 us

To clarify, Xen is using the local TLB version. So it should be 
vmalls12e1. But it looks like KVM will treat it the same way and I 
wonder whether this could be optimized? (I don't know much about the KVM 
implementation though).

> 
> So, to summarize using HCR_EL2.FB (which Xen already enables?) and then 
> using vmalle1 instead of vmalls12e1 should resolve the issue-2 for vCPUs 
> switching on pCPUs.

I don't think HCR_EL2.FB would matter here.

> 
> Coming back to issue-1, what do you think about creating a batch version 
> of hypercall XENMEM_remove_from_physmap (other batch versions exist such 
> as for XENMEM_add_to_physmap) and doing the TLB invalidation only once 
> per this hypercall?

Before going into batching, do you have any data showing how often 
XENMEM_remove_from_physmap is called in your setup? Similarly, I would 
be interested to know the number of TLB flushes within one hypercall 
and whether the unmapped regions were contiguous.

In your previous e-mail you wrote:

> During the creation of domu, first the domu memory is mapped onto 
> dom0 domain, images are copied into it, and it is then unmapped. During 
> unmapping, the TLB translations are invalidated one by one for each page 
> being unmapped in XENMEM_remove_from_physmap hypercall. Here is the code 
> snippet where the decision to flush TLBs is being made during removal of 
> mapping.

Don't we map only the memory that is needed to copy the binaries? If 
not, then I would suggest looking at that first.

I am asking because even with batching, we may still issue a few TLB 
flushes because:
    * We need to avoid long-running operations, so the hypercall may 
restart. So we will have to flush at minimum before every restart.
    * The current way we handle batching is to process one item at a 
time. As this may free memory (either leaf or intermediate 
page-tables), we will need to flush the TLBs first to prevent the domain 
from accessing the wrong memory. This could be solved by keeping track 
of the list of memory to free. But this is going to require some work 
and I am not entirely sure this is worth it at the moment.
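
To make the second point concrete, here is a minimal host-side model 
of the two flushing strategies. It is only a sketch and not Xen code: 
the names `struct stats`, `remove_eager`, `remove_deferred` and the 
`BATCH_MAX` bound are all hypothetical, and a "flush" or "free" is just 
a counter increment. The deferred variant also flushes whenever its 
queue fills up, which stands in for the point where the hypercall would 
be preempted and restarted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: counters stand in for real TLB flushes and frees. */
struct stats {
    unsigned int flushes;
    unsigned int frees;
};

#define BATCH_MAX 64 /* assumed bound on deferred pages before a restart */

/* Eager scheme: each unmap may free memory, so flush before every free. */
static void remove_eager(struct stats *s, size_t nr_pages)
{
    assert(s != NULL);

    for (size_t i = 0; i < nr_pages; i++) {
        s->flushes++; /* flush so the domain can no longer reach the page */
        s->frees++;   /* only now is the page safe to free */
    }
}

/*
 * Deferred scheme: remember the pages to free, then flush once and free
 * the whole queue. A flush is forced when the queue is full (modelled
 * restart point) and at the end of the request.
 */
static void remove_deferred(struct stats *s, size_t nr_pages)
{
    size_t queued = 0;

    assert(s != NULL);

    for (size_t i = 0; i < nr_pages; i++) {
        queued++; /* entry cleared, page remembered for later */
        if (queued == BATCH_MAX || i + 1 == nr_pages) {
            s->flushes++;       /* one flush covers the whole queue */
            s->frees += queued; /* queued pages are now unreachable */
            queued = 0;
        }
    }
}
```

With these assumptions, removing 200 pages costs 200 modelled flushes 
in the eager scheme and 4 in the deferred one. The real saving would 
depend on how often the hypercall has to restart.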

> I just realized that ripas2e1 is a range TLBI 
> instruction which is only supported after Armv8.4 indicated 
> by ID_AA64ISAR0_EL1.TLB == 2. So, on older architectures, full stage-2 
> invalidation would be required. For an architecture independent 
> solution, creating a batch version seems to be a better way.

I don't think we necessarily need a full stage-2 invalidation for 
processors not supporting range TLBI. We could use a series of TLBI 
IPAS2E1IS which I think is what TLBI range is meant to replace (so long 
as the addresses are contiguous in the given space).

On the KVM side, it would be worth looking at whether the implementation 
can be optimized. Is this really walking block by block? Can it skip 
over large holes (e.g. if we know a level 1 entry doesn't exist, then we 
can increment by 1GB)?
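
The hole-skipping idea can be sketched with a small host-side model. 
This does not reflect how KVM actually walks its tables. It assumes a 
4KB granule (so a missing level 1 entry covers 1GB and a missing 
level 2 entry covers 2MB), models the stage-2 tables as a single 
page-aligned mapped range, and uses hypothetical names throughout 
(`struct region`, `hole_size`, `walk_range`).

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K UINT64_C(0x1000)
#define SZ_2M UINT64_C(0x200000)
#define SZ_1G UINT64_C(0x40000000)

/* Hypothetical model of a stage-2 table: one mapped range [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
};

struct walk_result {
    uint64_t steps;  /* lookups performed */
    uint64_t mapped; /* 4KB pages found mapped */
};

/* Does the aligned block of 'size' containing 'ipa' hold any mapping? */
static int block_has_mapping(const struct region *map, uint64_t ipa,
                             uint64_t size)
{
    uint64_t block_start = ipa & ~(size - 1);
    uint64_t block_end = block_start + size;

    return map->start < block_end && block_start < map->end;
}

/* Size covered by the first missing entry, or 0 if 'ipa' is mapped. */
static uint64_t hole_size(const struct region *map, uint64_t ipa)
{
    if (!block_has_mapping(map, ipa, SZ_1G))
        return SZ_1G; /* level 1 entry missing */
    if (!block_has_mapping(map, ipa, SZ_2M))
        return SZ_2M; /* level 2 entry missing */
    if (!block_has_mapping(map, ipa, SZ_4K))
        return SZ_4K; /* level 3 entry missing */
    return 0;
}

/* Walk [start, end), jumping to the next boundary past each hole. */
static struct walk_result walk_range(const struct region *map,
                                     uint64_t start, uint64_t end)
{
    struct walk_result r = { 0, 0 };
    uint64_t ipa = start;

    assert(start <= end);

    while (ipa < end) {
        uint64_t skip = hole_size(map, ipa);

        if (skip == 0) {
            r.mapped++;   /* real code would process the page here */
            skip = SZ_4K;
        }
        ipa = (ipa & ~(skip - 1)) + skip;
        r.steps++;
    }
    return r;
}
```

In this model, walking a 4GB space that contains only three mapped 
pages at 2GB takes 1026 lookups, against 1048576 for a strictly 
page-by-page walk.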

Cheers,

-- 
Julien Grall

Thread overview: 18+ messages
2025-10-30  6:12 Limitations for Running Xen on KVM Arm64 haseeb.ashraf
2025-10-30 13:41 ` haseeb.ashraf
2025-10-30 18:33   ` Mohamed Mediouni
2025-10-30 23:55     ` Julien Grall
2025-10-31  0:20       ` Mohamed Mediouni
2025-10-31  0:38         ` Mohamed Mediouni
2025-10-31  9:18         ` Julien Grall
2025-10-31 11:54           ` Mohamed Mediouni
2025-11-01 17:20             ` Julien Grall
2025-10-31 13:01           ` haseeb.ashraf
2025-11-01 18:23             ` Julien Grall [this message]
2025-11-03 13:09               ` haseeb.ashraf
2025-11-03 14:30                 ` Julien Grall
2025-11-04  7:50                   ` haseeb.ashraf
2025-11-05 13:39                     ` haseeb.ashraf
2025-11-05 17:44                       ` Julien Grall
2025-10-31 15:17 ` Mohamed Mediouni
2025-11-01  2:04 ` Demi Marie Obenour
