From: Julien Grall <julien.grall@arm.com>
To: Sergej Proskurin <proskurin@sec.in.tum.de>,
xen-devel@lists.xenproject.org
Cc: Stefano Stabellini <sstabellini@kernel.org>
Subject: Re: [PATCH v4 8/9] arm/mem_access: Add short-descriptor based gpt
Date: Thu, 22 Jun 2017 14:53:21 +0100 [thread overview]
Message-ID: <dd0f8b2b-3072-f729-ddb7-ad2c969a7d5f@arm.com> (raw)
In-Reply-To: <20170620203332.17833-9-proskurin@sec.in.tum.de>
Hi Sergej,
On 20/06/17 21:33, Sergej Proskurin wrote:
> This commit adds functionality to walk the guest's page tables using the
> short-descriptor translation table format for both ARMv7 and ARMv8. The
> implementation is based on ARM DDI 0487B-a J1-6002 and ARM DDI 0406C-b
> B3-1506.
>
> Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
> ---
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> ---
> v3: Move the implementation to ./xen/arch/arm/guest_copy.c.
>
> Use defines instead of hardcoded values.
>
> Cosmetic fixes & Added more coments.
>
> v4: Adjusted the names of short-descriptor data-types.
>
> Adapt the function to the new parameter of type "struct vcpu *".
>
> Cosmetic fixes.
> ---
> xen/arch/arm/guest_walk.c | 167 ++++++++++++++++++++++++++++++++++++++-
> xen/include/asm-arm/guest_walk.h | 16 ++++
> 2 files changed, 181 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
> index c37c595157..9cc167af49 100644
> --- a/xen/arch/arm/guest_walk.c
> +++ b/xen/arch/arm/guest_walk.c
> @@ -18,6 +18,7 @@
> #include <xen/sched.h>
> #include <xen/domain_page.h>
> #include <asm/guest_walk.h>
> +#include <asm/short-desc.h>
>
> /*
> * The function guest_walk_sd translates a given GVA into an IPA using the
> @@ -30,8 +31,170 @@ static int guest_walk_sd(const struct vcpu *v,
> vaddr_t gva, paddr_t *ipa,
> unsigned int *perms)
> {
> - /* Not implemented yet. */
> - return -EFAULT;
> + bool disabled = true;
> + int64_t ttbr;
TTBR is a register. It cannot be signed.
> + paddr_t mask, paddr;
> + short_desc_t pte, *table;
> + struct page_info *page;
> + register_t ttbcr = READ_SYSREG(TCR_EL1);
> + unsigned int level = 0, n = ttbcr & TTBCR_N_MASK;
> + struct domain *d = v->domain;
> +
> + const paddr_t offsets[2] = {
Why are you using paddr_t here?
Looking at the code, I see very limited point of having the offsets
array as you don't use a loop and also use each offset in a single place.
> + ((paddr_t)(gva >> 20) & ((1ULL << (12 - n)) - 1)),
Why the cast? After gva is a virtual address not physical one.
Also, this code is a bit cryptic to read. Can we have some documentation
at least?
Furthermore, it would be clearer if you first mask then shift. As it
helps to understand the code.
If you use GENMASK (or GENMASK_ULL if you don't extend GENMASK), this
will make everything more obvious:
(gva & GENMASK(31 - n, 20)) >> 20
> + ((paddr_t)(gva >> 12) & ((1ULL << 8) - 1))
(gva & GENMASK(20, 12)) >> 12
> + };
> +
> + mask = ((1ULL << BITS_PER_WORD) - 1) &
> + ~((1ULL << (BITS_PER_WORD - n)) - 1);
Same remark as on the previous patch for BITS_PER_WORD + you could use
GENMASK.
> +
> + if ( n == 0 || !(gva & mask) )
> + {
> + /* Use TTBR0 for GVA to IPA translation. */
> + ttbr = READ_SYSREG64(TTBR0_EL1);
Looking at the documentation. For short-descriptor, the register will be
32-bit. Whilst I can understand why you use READ_SYSREG64, you should at
least document it. You also probably want to have ttbr uint32_t in that
case.
> +
> + /* If TTBCR.PD0 is set, translations using TTBR0 are disabled. */
> + disabled = ttbcr & TTBCR_PD0;
> + }
> + else
> + {
> + /* Use TTBR1 for GVA to IPA translation. */
> + ttbr = READ_SYSREG64(TTBR1_EL1);
Ditto.
> +
> + /* If TTBCR.PD1 is set, translations using TTBR1 are disabled. */
> + disabled = ttbcr & TTBCR_PD1;
> +
> + /*
> + * TTBR1 translation always works like n==0 TTBR0 translation (ARM DDI
> + * 0487B.a J1-6003).
> + */
> + n = 0;
> + }
> +
> + if ( disabled )
> + return -EFAULT;
> +
> + /*
> + * The address of the descriptor for the initial lookup has the following
> + * format: [ttbr<31:14-n>:gva<31-n:20>:00] (ARM DDI 0487B.a J1-6003). In
> + * this way, the first lookup level might comprise up to four consecutive
> + * pages. To avoid mapping all of the pages, we simply map the page that is
> + * needed by the first level translation by incorporating up to 2 MSBs of
> + * the GVA.
> + */
> + mask = (1ULL << (14 - n)) - 1;
mask = GENMASK(31, 14);
> + paddr = (ttbr & ~mask) | (offsets[level] << 2);
> +
> + page = get_page_from_gfn(d, paddr_to_pfn(paddr), NULL, P2M_ALLOC);
> + if ( !page )
> + return -EFAULT;
> +
> + table = __map_domain_page(page);
> +
> + /*
> + * Consider that the first level address translation does not need to be
> + * page-aligned if n > 2.
> + */
> + if ( n > 2 )
> + {
> + /* Make sure that we consider the bits ttbr<12:14-n> if n > 2. */
> + mask = ((1ULL << 12) - 1) & ~((1ULL << (14 - n)) - 1);
GENMASK(12, 14 - n);
> + table = (short_desc_t *)((unsigned long)table | (unsigned long)(ttbr & mask));
> + }
> +
> + /*
> + * As we have considered up to 2 MSBs of the GVA for mapping the first
> + * level translation table, we need to make sure that we limit the table
> + * offset that is is indexed by GVA<31-n:20> to max 10 bits to avoid
> + * exceeding the page size limit.
> + */
> + mask = ((1ULL << 10) - 1);
> + pte = table[offsets[level] & mask];
This looks quite complex for just reading a pte. I think it would be
worth to leverage the vgic_guest_access_memory for that (same in LPAE).
It would also add safety if the offsets the table is mistakenly computed
(from the current code, I can't convince myself the offset will always
be right).
> +
> + unmap_domain_page(table);
> + put_page(page);
> +
> + switch ( pte.walk.dt )
> + {
> + case L1DESC_INVALID:
> + return -EFAULT;
> +
> + case L1DESC_PAGE_TABLE:
> + level++;
> +
> + page = get_page_from_gfn(d, (pte.walk.base >> 2), NULL, P2M_ALLOC);
> + if ( !page )
> + return -EFAULT;
> +
> + table = __map_domain_page(page);
> + /*
> + * The second level translation table is addressed by PTE<31:10>. Hence
> + * it does not need to be page aligned. Make sure that we also consider
> + * the bits PTE<11:10>.
> + */
> + table = (short_desc_t *)((unsigned long)table | ((pte.walk.base & 0x3) << 10));
> +
> + pte = table[offsets[level]];
Ditto.
> +
> + unmap_domain_page(table);
> + put_page(page);
> +
> + if ( pte.walk.dt == L2DESC_INVALID )
> + return -EFAULT;
> +
> + if ( pte.pg.page ) /* Small page. */
> + {
> + mask = (1ULL << PAGE_SHIFT_4K) - 1;
> +
> + *ipa = (pte.pg.base << PAGE_SHIFT_4K) | (gva & mask);
> +
> + /* Set execute permissions associated with the small page. */
> + if ( !pte.pg.xn )
> + *perms = GV2M_EXEC;
As perms is modified in few places over the code. I would prefer if you
first reset *perms at suitable value in guest_walk_tables and the use |=
everywhere.
This would avoid any mistake in the future where the permission is not
reported correctly.
> + }
> + else /* Large page. */
> + {
> + mask = (1ULL << PAGE_SHIFT_64K) - 1;
> +
> + *ipa = (pte.lpg.base << PAGE_SHIFT_64K) | (gva & mask);
> +
> + /* Set execute permissions associated with the large page. */
> + if ( !pte.lpg.xn )
> + *perms = GV2M_EXEC;
See above.
> + }
> +
> + /* Set permissions so that the caller can check the flags by herself. */
> + if ( !pte.pg.ro )
> + *perms |= GV2M_WRITE;
> +
> + break;
> +
> + case L1DESC_SECTION:
> + case L1DESC_SECTION_PXN:
> + if ( !pte.sec.supersec ) /* Section */
> + {
> + mask = (1ULL << L1DESC_SECTION_SHIFT) - 1;
It is quite strange that above you use plain PAGE_SHIFT_4K which is not
really related to short-descriptor (you define it in lpae.h) and here
you use short-descriptor name. Please try to stay consistent.
> +
> + *ipa = (pte.sec.base << L1DESC_SECTION_SHIFT) | (gva & mask);
> + }
> + else /* Supersection */
> + {
> + mask = (1ULL << L1DESC_SUPERSECTION_SHIFT) - 1;
> +
> + *ipa = gva & mask;
> + *ipa |= (paddr_t)(pte.supersec.base) << L1DESC_SUPERSECTION_SHIFT;
> + *ipa |= (paddr_t)(pte.supersec.extbase1) << L1DESC_SUPERSECTION_EXT_BASE1_SHIFT;
> + *ipa |= (paddr_t)(pte.supersec.extbase2) << L1DESC_SUPERSECTION_EXT_BASE2_SHIFT;
> + }
> +
> + /* Set permissions so that the caller can check the flags by herself. */
> + if ( !pte.sec.ro )
> + *perms = GV2M_WRITE;
My suggestion about perms would also avoid undefined permission if the
region is read-only as none of the callers today will initialize perms.
> + if ( !pte.sec.xn )
> + *perms |= GV2M_EXEC;
> + }
> +
> + return 0;
> }
>
> /*
> diff --git a/xen/include/asm-arm/guest_walk.h b/xen/include/asm-arm/guest_walk.h
> index 4ed8476e08..d0bed0c7a2 100644
> --- a/xen/include/asm-arm/guest_walk.h
> +++ b/xen/include/asm-arm/guest_walk.h
> @@ -1,6 +1,22 @@
> #ifndef _XEN_GUEST_WALK_H
> #define _XEN_GUEST_WALK_H
>
> +/* First level translation table descriptor types used by the AArch32
> + * short-descriptor translation table format. */
> +#define L1DESC_INVALID (0)
> +#define L1DESC_PAGE_TABLE (1)
> +#define L1DESC_SECTION (2)
> +#define L1DESC_SECTION_PXN (3)
> +
> +/* Defines for section and supersection shifts. */
> +#define L1DESC_SECTION_SHIFT (20)
> +#define L1DESC_SUPERSECTION_SHIFT (24)
> +#define L1DESC_SUPERSECTION_EXT_BASE1_SHIFT (32)
> +#define L1DESC_SUPERSECTION_EXT_BASE2_SHIFT (36)
> +
> +/* Second level translation table descriptor types. */
> +#define L2DESC_INVALID (0)
I think those one belongs to short-desc.h.
Cheers,
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
next prev parent reply other threads:[~2017-06-22 13:53 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-20 20:33 [PATCH v4 0/9] arm/mem_access: Walk guest page tables in SW if mem_access is active Sergej Proskurin
2017-06-20 20:33 ` [PATCH v4 1/9] arm/mem_access: Add (TCR_|TTBCR_)* defines Sergej Proskurin
2017-06-22 10:42 ` Julien Grall
2017-06-20 20:33 ` [PATCH v4 2/9] arm/mem_access: Add defines supporting PTs with varying page sizes Sergej Proskurin
2017-06-22 10:52 ` Julien Grall
2017-06-20 20:33 ` [PATCH v4 3/9] arm/mem_access: Add short-descriptor pte typedefs Sergej Proskurin
2017-07-04 16:22 ` Julien Grall
2017-07-04 16:24 ` Julien Grall
2017-06-20 20:33 ` [PATCH v4 4/9] arm/mem_access: Introduce GV2M_EXEC permission Sergej Proskurin
2017-06-20 20:33 ` [PATCH v4 5/9] arm/mem_access: Extend BIT-operations to unsigned long long Sergej Proskurin
2017-06-22 11:13 ` Julien Grall
2017-06-20 20:33 ` [PATCH v4 6/9] arm/mem_access: Add software guest-page-table walk Sergej Proskurin
2017-06-22 11:16 ` Julien Grall
2017-06-22 11:36 ` Sergej Proskurin
2017-06-20 20:33 ` [PATCH v4 7/9] arm/mem_access: Add long-descriptor based gpt Sergej Proskurin
2017-06-22 12:12 ` Julien Grall
2017-06-23 14:23 ` Sergej Proskurin
2017-06-23 14:35 ` Julien Grall
2017-06-22 13:54 ` Julien Grall
2017-06-20 20:33 ` [PATCH v4 8/9] arm/mem_access: Add short-descriptor " Sergej Proskurin
2017-06-22 13:53 ` Julien Grall [this message]
2017-06-23 19:09 ` Sergej Proskurin
2017-06-23 20:46 ` Julien Grall
2017-06-26 7:57 ` Sergej Proskurin
2017-06-20 20:33 ` [PATCH v4 9/9] arm/mem_access: Walk the guest's pt in software Sergej Proskurin
2017-06-20 20:44 ` Tamas K Lengyel
2017-06-20 20:59 ` Sergej Proskurin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dd0f8b2b-3072-f729-ddb7-ad2c969a7d5f@arm.com \
--to=julien.grall@arm.com \
--cc=proskurin@sec.in.tum.de \
--cc=sstabellini@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).