qemu-devel.nongnu.org archive mirror
* page coloring and accelerated shadow paging
@ 2025-06-22  5:48 Michael Clark
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Clark @ 2025-06-22  5:48 UTC
  To: qemu-devel, Richard Henderson, Paolo Bonzini, Peter Maydell

Hi QEMU Folks,

# background

I'm sending this out here because if I were a QEMU developer I'd like
to read about this, as it is informed by working on QEMU and other
simulators and emulators. this work is by no means complete. in fact,
it is just the beginning, but there is enough present for feedback.
in particular an early revision of the virtual memory system design.

# overview

a new system that is not yet in a simulator but documents some
interesting concepts such as canonicalized 'translation addresses'
possessing address space prefixes to ease meta-circular emulation of
physical and virtual memory. the core is designed to make shadow-paging
efficient as a foreign page table architecture, but there are
extended 'features' that would need to be in hardware to be fast.
the intention is that the address space prefixes are in registers to
address the "address space narrowing problem". there is also an
intention to add an as yet undefined MIPS-inspired TLB miss handler,
so that the page walker logic can be in software. this is to make
adding support for foreign page table formats much easier. and when
translated to native code most of it will be cache miss latency anyway.
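
a minimal sketch of the software TLB miss idea, assuming 4 KiB pages and
a flat dictionary standing in for a foreign page table format. the names
`SoftTLB`, `Walker` and `make_flat_walker` are hypothetical and do not
come from the glyph spec; the point is only that the walker is a
replaceable function called on a miss.

```python
from typing import Callable, Dict, Optional, Tuple

PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# a walker maps a virtual page number to (physical page number, perms),
# or returns None when the foreign page table has no mapping
Walker = Callable[[int], Optional[Tuple[int, str]]]


class PageFault(Exception):
    pass


class SoftTLB:
    """TLB that is refilled by a software miss handler (hypothetical model)."""

    def __init__(self, walker: Walker):
        self.walker = walker
        self.entries: Dict[int, Tuple[int, str]] = {}
        self.misses = 0

    def translate(self, vaddr: int, access: str) -> int:
        vpn = vaddr >> PAGE_SHIFT
        entry = self.entries.get(vpn)
        if entry is None:
            # miss: hand control to the software walker and cache the result
            self.misses += 1
            entry = self.walker(vpn)
            if entry is None:
                raise PageFault(f"no mapping for {vaddr:#x}")
            self.entries[vpn] = entry
        ppn, perms = entry
        if access not in perms:
            raise PageFault(f"{access} denied at {vaddr:#x}")
        return (ppn << PAGE_SHIFT) | (vaddr & PAGE_MASK)


def make_flat_walker(table: Dict[int, Tuple[int, str]]) -> Walker:
    """stand-in for a foreign page table format: a flat vpn -> entry map."""
    return table.get
```

supporting another page table format in this model means supplying a
different walker function; the TLB itself does not change.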

btw this started as a sketch in a gist on June 20th of last year:

https://gist.github.com/michaeljclark/8f9b81e5e40488035dc252c9da3ecc2e

# the glyph architecture

current: https://metaparadigm.com/~mclark/glyph.pdf
latest: https://metaparadigm.com/~mclark/glyph-20250622.pdf

a clean-slate, portable virtual machine designed for efficient binary
translation to X86 system mode targeting features like SMAP, SMEP,
APIC, TSC, and PCI-style message signaled interrupts, with a system
model that favors simplicity e.g., only supervisor/user modes and no
interrupt delegation. the fundamental architectural design elements,
such as the variable length instruction packet format and split
instruction and constant streams are in place, the 16-bit compressed
opcodes are fully specified, and there are now the beginnings of a
system or protected mode. yet there is still a lot of work for it to
virtualize a target like X86+AVX-512 with address translation.

## address translation

adopts a page table format designed for shadow paging on X86. the page
translation system has a physical address permission check feature and
adds PTE.T (translate bit) for optional hardware zoning of translation
pages. in addition to virtual memory, the translation system performs
optional per-page physical permission checks, and physical self-mapping
validation with zoning for PTE pages that have the PTE.T bit set.
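
one possible reading of the physical permission check, sketched under
loud assumptions: the physical tree is flattened to a dictionary, the
bit values of `R`, `W`, `X` and `T` are made up, and a single-level
virtual table is used. none of these names or encodings come from the
spec. the sketch only shows the order of checks: the page holding the
PTE must be zoned with T, and the final physical page must self-map and
allow the access.

```python
from dataclasses import dataclass
from typing import Dict

PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1
R, W, X, T = 1, 2, 4, 8              # assumed flag bits; T = translate


@dataclass
class PTE:
    ppn: int     # physical page number the entry points at
    flags: int   # permission and zoning bits


class Fault(Exception):
    pass


def check_physical(phys_map: Dict[int, PTE], ppn: int, need: int) -> None:
    """look up a physical page in the (flattened) physical tree."""
    entry = phys_map.get(ppn)
    if entry is None:
        raise Fault(f"no physical entry for page {ppn:#x}")
    if entry.ppn != ppn:
        # the physical tree is expected to map each page to itself
        raise Fault(f"page {ppn:#x} is not self-mapped")
    if (entry.flags & need) != need:
        raise Fault(f"physical permission denied for page {ppn:#x}")


def translate(virt_map: Dict[int, PTE], phys_map: Dict[int, PTE],
              table_ppn: int, vaddr: int, need: int) -> int:
    # the page that holds the PTEs must be zoned for translation (T set)
    check_physical(phys_map, table_ppn, T)
    pte = virt_map.get(vaddr >> PAGE_SHIFT)
    if pte is None or (pte.flags & need) != need:
        raise Fault(f"virtual permission denied at {vaddr:#x}")
    # per-page physical permission check on the leaf result
    check_physical(phys_map, pte.ppn, need)
    return (pte.ppn << PAGE_SHIFT) | (vaddr & PAGE_MASK)
```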

the architecture introduces the concept of translation addresses, which
are addresses boxed with an address space prefix (AS) designed to
provide a canonical address form for user and supervisor virtual
addresses as well as physical addresses, to make it easier to implement
meta-circular emulation for nested page translation with translation
agnostic source and destination address spaces.
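
a minimal sketch of boxing and unboxing, assuming a 48-bit address
payload with the prefix in the bits above it. the prefix values in `AS`
are placeholders and not the encodings from the spec, and `box`, `unbox`
and `resolve` are hypothetical names. because every stage consumes and
produces the same boxed form, nested translation becomes a loop over
stage functions.

```python
from enum import IntEnum
from typing import Callable, Dict, Tuple

ADDR_BITS = 48                        # assumed width of the address payload
ADDR_MASK = (1 << ADDR_BITS) - 1


class AS(IntEnum):
    # placeholder prefix values, NOT the encodings used by the spec
    USER = 0x0001
    SUPERVISOR = 0x0002
    PHYSICAL = 0x0003


def box(space: AS, addr: int) -> int:
    """combine an address space prefix and an address into one value."""
    if addr & ~ADDR_MASK:
        raise ValueError("address does not fit in the payload")
    return (int(space) << ADDR_BITS) | addr


def unbox(ta: int) -> Tuple[AS, int]:
    """split a translation address into (address space, address)."""
    return AS(ta >> ADDR_BITS), ta & ADDR_MASK


# a stage takes an address in its source space and returns a boxed
# translation address in whatever space it translates to
Stage = Callable[[int], int]


def resolve(ta: int, stages: Dict[AS, Stage], limit: int = 8) -> int:
    """apply stages until the address is in a space with no stage."""
    for _ in range(limit):
        space, addr = unbox(ta)
        if space not in stages:
            return ta
        ta = stages[space](addr)
    raise RuntimeError("translation did not terminate")
```

for example, a two-stage setup maps user to supervisor and supervisor to
physical, and `resolve` does not need to know how many stages exist.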

## capabilities checks

adds overlay permissions for page table colors 'colorperms' which cause
faults during address translation. also 'colorcaps' and 'colormatrix'
add capability checks at execute time and are not part of page table
translation. colorcaps is used to control use of system instructions
based on page color. colormatrix is used to control source and target
page capabilities for loads, stores, and branches. goal: allow load,
store, and branch permission restrictions via source and destination
page color pair forwarding. requires the micro-architecture to track
source:target page colors for branch retirement permission checks.
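
a minimal sketch of how the three tables could be consulted, assuming
four colors and made-up bit encodings. only the names colorperms,
colorcaps and colormatrix come from the text above; `ColorState`, the
method names and all bit values are hypothetical. colorperms is applied
at translation time, while colorcaps and colormatrix are applied at
execute time.

```python
R, W, X = 1, 2, 4                  # assumed page permission bits
LOAD, STORE, BRANCH = 1, 2, 4      # assumed colormatrix operation bits
SYSTEM = 1                         # assumed colorcaps bit
NUM_COLORS = 4                     # assumed number of page colors


class ColorState:
    def __init__(self) -> None:
        # by default colors do not restrict page permissions
        self.colorperms = [R | W | X] * NUM_COLORS
        # by default no color may use system instructions
        self.colorcaps = [0] * NUM_COLORS
        # colormatrix[src][dst]: operations code on a src-colored page
        # may perform on a dst-colored page; default deny
        self.colormatrix = [[0] * NUM_COLORS for _ in range(NUM_COLORS)]

    def translate_check(self, pte_perms: int, color: int, access: int) -> bool:
        """translation time: colorperms overlays the PTE permissions."""
        return bool(pte_perms & self.colorperms[color] & access)

    def system_check(self, code_color: int) -> bool:
        """execute time: may code on this color use system instructions?"""
        return bool(self.colorcaps[code_color] & SYSTEM)

    def execute_check(self, src_color: int, dst_color: int, op: int) -> bool:
        """execute time: is op allowed from the source to the target color?"""
        return bool(self.colormatrix[src_color][dst_color] & op)
```

the `(src_color, dst_color)` pair in `execute_check` corresponds to the
source:target color pair that the text says has to be tracked through
to retirement.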

## toolchain and debug strategy

developing three simulators (Python, C, Go) for cross-validation, with
plans for a Python assembler mainly to avoid binutils during early
bring-up. the assembler and linker need to support capabilities with
graph coloring to augment section permissions with section colors.
operating system kernels and runtime loaders will also need support.

Regards,
Michael.



* page coloring and accelerated shadow paging
@ 2025-06-22 20:47 Michael Clark
  2025-06-23 20:43 ` Michael Clark
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Clark @ 2025-06-22 20:47 UTC
  To: qemu-devel


* Re: page coloring and accelerated shadow paging
  2025-06-22 20:47 Michael Clark
@ 2025-06-23 20:43 ` Michael Clark
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Clark @ 2025-06-23 20:43 UTC
  To: qemu-devel

Hi All,

On 6/23/25 08:47, Michael Clark wrote:
[snipped]

> btw this started as a sketch in a gist in June 20th of last year:
> 
> https://gist.github.com/michaeljclark/8f9b81e5e40488035dc252c9da3ecc2e
> 
> # the glyph architecture
> 
> current: https://metaparadigm.com/~mclark/glyph.pdf
> latest: https://metaparadigm.com/~mclark/glyph-20250622.pdf
> 
> a clean-slate, portable virtual machine designed for efficient binary
> translation to X86 system mode targeting features like SMAP, SMEP,
> APIC, TSC, and PCI-style message signaled interrupts, with a system
> model that favors simplicity e.g., only supervisor/user modes and no
> interrupt delegation. the fundamental architectural design elements,
> such as the variable length instruction packet format and split
> instruction and constant streams are in place, the 16-bit compressed
> opcodes are fully specified, and there is now the beginnings of a
> system or protected mode. yet there is still a lot of work for it to
> virtualize a target like X86+AVX-512 with address translation.
> 
> ## address translation
> 
> adopts a page table format designed for shadow paging on X86. the page
> translation system has a physical address permission check feature and
> adds PTE.T (translate bit) for optional hardware zoning of translation
> pages. in addition to virtual memory, the translation system performs
> optional per-page physical permission checks, and physical self-mapping
> validation with zoning for PTE pages that have the PTE.T bit set.
> 
> the architecture introduces the concept of a translation address which
> are addresses boxed with an address space prefix (AS) designed to
> provide a canonical address form for user and supervisor virtual
> addresses as well as physical addresses, to make it easier to implement
> meta-circular emulation for nested page translation with translation
> agnostic source and destination address spaces.

there were some errata in yesterday's spec, but the bones and the
intention are there. I haven't seen page tables deployed in this way
before, to check self maps in a physical tree, with the primary intent
of setting permissions for PTE zones and monitoring page table pages.
so there may be some more bugs when we try to test it out in a simulator.

- https://github.com/michaeljclark/glyph/
- https://metaparadigm.com/~mclark/glyph-20250623.pdf

errata:

- swap the supervisor and physical address space prefixes
   to be consistent with Linux kernel address space on x86-64.
- physical permission self-mapping check is only on leaf entries
- rename the P bit to the T bit in the 64-bit page table entry structure

turns out that if we add an 'internal' AS it would be ..FFFE0000..
so we could think about adding an internal address space prefix.

Michael.


