From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Anthony PERARD <anthony.perard@vates.tech>,
Michal Orzel <michal.orzel@amd.com>,
Julien Grall <julien@xen.org>,
Stefano Stabellini <sstabellini@kernel.org>,
Tim Deegan <tim@xen.org>,
xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 00/18] x86: adventures in Address Space Isolation
Date: Fri, 17 Jan 2025 15:45:57 +0100 [thread overview]
Message-ID: <Z4ptJWmEwWdGsocD@macbook.local> (raw)
In-Reply-To: <8064a526-3525-4c48-8926-6ea99a8312a6@suse.com>
On Tue, Jan 14, 2025 at 05:20:04PM +0100, Jan Beulich wrote:
> On 08.01.2025 15:26, Roger Pau Monne wrote:
> > Hello,
> >
> > The aim of this series is to introduce the functionality required to
> > create linear mappings visible to a single pCPU.
> >
> > Doing so requires having a per-vCPU root page-table (L4), and hence
> > requires shadowing the guest selected L4 on PV guests. As follow ups
> > (and partially to ensure the per-CPU mappings work fine) the CPU stacks
> > are switched to use per-CPU mappings, so that remote stack contents are
> > not by default mapped on all page-tables (note: for this to be true the
> > directmap entries for the stack pages would need to be removed also).
> >
> > There's one known shortcoming with the presented code: migration of PV
> > guests using per-vCPU root page-tables is not working. I need to
> > introduce extra logic to deal with PV shadow mode when using unique root
> > page-tables. I don't think this should block the series however, such
> > missing functionality can always be added as follow up work.
> > paging_domctl() is adjusted to reflect this restriction.
> >
> > The main differences compared to v1 are the usage of per-vCPU root page
> > tables (as opposed to per-pCPU), and the usage of the existing perdomain
> > family of functions to manage the mappings in the per-domain slot, that
> > now becomes per-vCPU.
> >
> > All patches until 17 are mostly preparatory, I think there's a nice
> > cleanup and generalization of the creation and managing of per-domain
> > mappings, by no longer storing references to L1 page-tables in the vCPU
> > or domain struct.
>
> Since you referred me to the cover letter, I've looked back here after
> making some more progress with the series. Along with my earlier comment
> towards the need or ultimate goal, ...
>
> > Patch 13 introduces the command line option, and would need discussion
> > and integration with the sparse direct map series. IMO we should get
> > consensus on how we want the command line to look ASAP, so that we can
> > basic parsing logic in place to be used by both the work here and the
> > direct map removal series.
> >
> > As part of this series the map_domain_page() helpers are also switched
> > to create per-vCPU mappings (see patch 15), which converts an existing
> > interface into creating per-vCPU mappings. Such interface can be used
> > to hide (map per-vCPU) further data that we don't want to be part of the
> > direct map, or even shared between vCPUs of the same domain. Also all
> > existing users of the interface will already create per-vCPU mappings
> > without needing additional changes.
> >
> > Note that none of the logic introduced in the series removes entries for
> > the directmap, so even when creating the per-CPU mappings the underlying
> > physical addresses are fully accessible when using it's direct map
> > entries.
> >
> > I also haven't done any benchmarking. Doesn't seem to cripple
> > performance up to the point that XenRT jobs would timeout before
> > finishing, that the only objective reference I can provide at the
> > moment.
> >
> > The series has been extensively tested on XenRT, but that doesn't cover
> > all possible use-cases, so it's likely to still have some rough edges,
> > handle with care.
> >
> > Thanks, Roger.
> >
> > Roger Pau Monne (18):
> > x86/mm: purge unneeded destroy_perdomain_mapping()
> > x86/domain: limit window where curr_vcpu != current on context switch
> > x86/mm: introduce helper to detect per-domain L1 entries that need
> > freeing
> > x86/pv: introduce function to populate perdomain area and use it to
> > map Xen GDT
> > x86/mm: switch destroy_perdomain_mapping() parameter from domain to
> > vCPU
> > x86/pv: set/clear guest GDT mappings using
> > {populate,destroy}_perdomain_mapping()
> > x86/pv: update guest LDT mappings using the linear entries
> > x86/pv: remove stashing of GDT/LDT L1 page-tables
> > x86/mm: simplify create_perdomain_mapping() interface
> > x86/mm: switch {create,destroy}_perdomain_mapping() domain parameter
> > to vCPU
> > x86/pv: untie issuing FLUSH_ROOT_PGTBL from XPTI
> > x86/mm: move FLUSH_ROOT_PGTBL handling before TLB flush
> > x86/spec-ctrl: introduce Address Space Isolation command line option
> > x86/mm: introduce per-vCPU L3 page-table
> > x86/mm: introduce a per-vCPU mapcache when using ASI
> > x86/pv: allow using a unique per-pCPU root page table (L4)
> > x86/mm: switch to a per-CPU mapped stack when using ASI
> > x86/mm: zero stack on context switch
> >
> > docs/misc/xen-command-line.pandoc | 24 +++
> > xen/arch/x86/cpu/mcheck/mce.c | 4 +
> > xen/arch/x86/domain.c | 157 +++++++++++----
> > xen/arch/x86/domain_page.c | 105 ++++++----
> > xen/arch/x86/flushtlb.c | 28 ++-
> > xen/arch/x86/hvm/hvm.c | 6 -
> > xen/arch/x86/include/asm/config.h | 16 +-
> > xen/arch/x86/include/asm/current.h | 58 +++++-
> > xen/arch/x86/include/asm/desc.h | 6 +-
> > xen/arch/x86/include/asm/domain.h | 50 +++--
> > xen/arch/x86/include/asm/flushtlb.h | 2 +-
> > xen/arch/x86/include/asm/mm.h | 15 +-
> > xen/arch/x86/include/asm/processor.h | 5 +
> > xen/arch/x86/include/asm/pv/mm.h | 5 +
> > xen/arch/x86/include/asm/smp.h | 12 ++
> > xen/arch/x86/include/asm/spec_ctrl.h | 4 +
> > xen/arch/x86/mm.c | 291 +++++++++++++++++++++------
> > xen/arch/x86/mm/hap/hap.c | 2 +-
> > xen/arch/x86/mm/paging.c | 6 +
> > xen/arch/x86/mm/shadow/hvm.c | 2 +-
> > xen/arch/x86/mm/shadow/multi.c | 2 +-
> > xen/arch/x86/pv/descriptor-tables.c | 47 ++---
> > xen/arch/x86/pv/dom0_build.c | 12 +-
> > xen/arch/x86/pv/domain.c | 57 ++++--
> > xen/arch/x86/pv/mm.c | 43 +++-
> > xen/arch/x86/setup.c | 32 ++-
> > xen/arch/x86/smp.c | 39 ++++
> > xen/arch/x86/smpboot.c | 26 ++-
> > xen/arch/x86/spec_ctrl.c | 205 ++++++++++++++++++-
> > xen/arch/x86/traps.c | 25 ++-
> > xen/arch/x86/x86_64/mm.c | 7 +-
> > xen/common/smp.c | 10 +
> > xen/common/stop_machine.c | 10 +
> > xen/include/xen/smp.h | 8 +
> > 34 files changed, 1052 insertions(+), 269 deletions(-)
>
> ... this diffstat (even after subtracting out the contribution of the last two
> patches in the series) doesn't really look like a cleanup / simplification.
To be fair you would need to subtract the contribution of the last 8
patches, as all those are strictly related to ASI. The perdomain
mapping interface cleanup is just the first 10 patches. Which leaves
a diffstat of:
xen/arch/x86/domain.c | 81 ++++++++++++----
xen/arch/x86/domain_page.c | 19 ++--
xen/arch/x86/hvm/hvm.c | 6 --
xen/arch/x86/include/asm/desc.h | 6 +-
xen/arch/x86/include/asm/domain.h | 13 +--
xen/arch/x86/include/asm/mm.h | 11 ++-
xen/arch/x86/include/asm/processor.h | 5 +
xen/arch/x86/mm.c | 175 ++++++++++++++++++++++++++---------
xen/arch/x86/pv/descriptor-tables.c | 47 +++++-----
xen/arch/x86/pv/domain.c | 24 ++---
xen/arch/x86/pv/mm.c | 3 +-
xen/arch/x86/smpboot.c | 6 +-
xen/arch/x86/traps.c | 12 +--
xen/arch/x86/x86_64/mm.c | 7 +-
14 files changed, 260 insertions(+), 155 deletions(-)
That's including the context switch change and not differentiating
between lines of code vs comments.
However, I don't think cleanup / simplifications should be purely
based on diffstat LoC. Arguably the current
create_perdomain_mapping() set of parameters are not the most obvious
ones:
int create_perdomain_mapping(struct domain *d, unsigned long va,
unsigned int nr, l1_pgentry_t **pl1tab,
struct page_info **ppg);
Compared to the result after the first 10 patches in the series:
int create_perdomain_mapping(struct vcpu *v, unsigned long va,
unsigned int nr, bool populate);
Together with the fact that callers no longer need to keep a reference
to the L1(s) tables to populate such area.
> Things becoming slightly slower (because of the L1 no longer directly available
> to modify) may, otoh, not be a significant issue, if we assume that GDT/LDT
> manipulation isn't normally a very frequent operation.
I introduce a fast path in both {populate,destroy}_perdomain_mapping()
that uses the recursive linear slot for manipulating the L1
page-table. There's still a slow path that relies on walking the
page-tables, but that should only be used when the vCPU is not
running, and hence the added latency shouldn't be too critical.
As a side-effect of this logic the pages allocated for the per-domain
region page-tables can now uniformly be from domheap. The usage of
xenheap pages for L1 page-tables is no longer needed once those are
not stashed in the domain structure anymore.
> IOW my earlier request stands: Can you please try to make more clear (in the
> patch descriptions) what exactly the motivation for these changes is? Just
> doing things differently with more code overall can't be it, I don't think.
The main motivation for the change is to remove stashing L1
page-tables for the per-domain area(s) in the domain struct, as with
the introduction of ASI Xen would need to stash L1 page-tables for the
per-domain area on the vcpu struct also, as a result of the per-domain
slot becoming per-vCPU. IMO managing such references, and having
logic to deal with domain and vcpu L1 page-tables is too complex and
error prone.
Instead I propose to add an interface:
{populate,destroy}_perdomain_mapping() that can be used to manage
mappings on the per-domain area with callers being completely unaware
of whether the domain is running with per-vCPU mappings or not.
Please let me know if you are happy with the reasoning and arguments
provided. I think the resulting perdomain mapping interface is much
better than what Xen currently does for manipulating per-domain
mapping entries, but I might be spoiled.
Thanks, Roger.
prev parent reply other threads:[~2025-01-17 14:46 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-08 14:26 [PATCH v2 00/18] x86: adventures in Address Space Isolation Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 01/18] x86/mm: purge unneeded destroy_perdomain_mapping() Roger Pau Monne
2025-01-08 15:59 ` Alejandro Vallejo
2025-01-08 14:26 ` [PATCH v2 02/18] x86/domain: limit window where curr_vcpu != current on context switch Roger Pau Monne
2025-01-08 16:26 ` Alejandro Vallejo
2025-01-09 17:39 ` Roger Pau Monné
2025-01-09 8:59 ` Jan Beulich
2025-01-09 17:33 ` Roger Pau Monné
2025-01-14 15:02 ` Jan Beulich
2025-01-17 14:57 ` Roger Pau Monné
2025-01-08 14:26 ` [PATCH v2 03/18] x86/mm: introduce helper to detect per-domain L1 entries that need freeing Roger Pau Monne
2025-01-09 9:03 ` Jan Beulich
2025-01-08 14:26 ` [PATCH v2 04/18] x86/pv: introduce function to populate perdomain area and use it to map Xen GDT Roger Pau Monne
2025-01-09 9:10 ` Jan Beulich
2025-01-10 14:15 ` Roger Pau Monné
2025-01-09 9:55 ` Alejandro Vallejo
2025-01-10 14:29 ` Roger Pau Monné
2025-01-10 15:50 ` Alejandro Vallejo
2025-01-08 14:26 ` [PATCH v2 05/18] x86/mm: switch destroy_perdomain_mapping() parameter from domain to vCPU Roger Pau Monne
2025-01-09 10:02 ` Alejandro Vallejo
2025-01-10 14:30 ` Roger Pau Monné
2025-01-08 14:26 ` [PATCH v2 06/18] x86/pv: set/clear guest GDT mappings using {populate,destroy}_perdomain_mapping() Roger Pau Monne
2025-01-08 15:11 ` [PATCH v2.1 " Roger Pau Monne
2025-01-09 10:25 ` Alejandro Vallejo
2025-01-10 14:33 ` Roger Pau Monné
2025-01-14 15:30 ` Jan Beulich
2025-01-08 14:26 ` [PATCH v2 07/18] x86/pv: update guest LDT mappings using the linear entries Roger Pau Monne
2025-01-09 14:34 ` Alejandro Vallejo
2025-01-10 14:44 ` Roger Pau Monné
2025-01-10 15:36 ` Alejandro Vallejo
2025-01-14 15:42 ` Jan Beulich
2025-01-08 14:26 ` [PATCH v2 08/18] x86/pv: remove stashing of GDT/LDT L1 page-tables Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 09/18] x86/mm: simplify create_perdomain_mapping() interface Roger Pau Monne
2025-01-09 11:01 ` Alejandro Vallejo
2025-01-10 14:45 ` Roger Pau Monné
2025-01-08 14:26 ` [PATCH v2 10/18] x86/mm: switch {create,destroy}_perdomain_mapping() domain parameter to vCPU Roger Pau Monne
2025-01-14 16:27 ` Jan Beulich
2025-01-08 14:26 ` [PATCH v2 11/18] x86/pv: untie issuing FLUSH_ROOT_PGTBL from XPTI Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 12/18] x86/mm: move FLUSH_ROOT_PGTBL handling before TLB flush Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 13/18] x86/spec-ctrl: introduce Address Space Isolation command line option Roger Pau Monne
2025-01-09 14:58 ` Alejandro Vallejo
2025-01-10 14:55 ` Roger Pau Monné
2025-01-10 15:51 ` Alejandro Vallejo
2025-01-08 14:26 ` [PATCH v2 14/18] x86/mm: introduce per-vCPU L3 page-table Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 15/18] x86/mm: introduce a per-vCPU mapcache when using ASI Roger Pau Monne
2025-01-09 15:08 ` Alejandro Vallejo
2025-01-10 15:02 ` Roger Pau Monné
2025-01-10 16:12 ` Alejandro Vallejo
2025-01-10 16:19 ` Alejandro Vallejo
2025-01-10 18:43 ` Roger Pau Monné
2025-01-08 14:26 ` [PATCH v2 16/18] x86/pv: allow using a unique per-pCPU root page table (L4) Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 17/18] x86/mm: switch to a per-CPU mapped stack when using ASI Roger Pau Monne
2025-01-08 14:26 ` [PATCH v2 18/18] x86/mm: zero stack on context switch Roger Pau Monne
2025-01-14 16:20 ` [PATCH v2 00/18] x86: adventures in Address Space Isolation Jan Beulich
2025-01-17 14:45 ` Roger Pau Monné [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z4ptJWmEwWdGsocD@macbook.local \
--to=roger.pau@citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=anthony.perard@vates.tech \
--cc=jbeulich@suse.com \
--cc=julien@xen.org \
--cc=michal.orzel@amd.com \
--cc=sstabellini@kernel.org \
--cc=tim@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.