xen-devel.lists.xenproject.org archive mirror
* [PATCH v2 0/3] xen/arm: Add stage 2 48-bit PA support
@ 2014-05-27  6:46 vijay.kilari
  2014-05-27  6:46 ` [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation vijay.kilari
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: vijay.kilari @ 2014-05-27  6:46 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add 4-level page tables for stage 2 translation to
support a 48-bit physical address range.

Changes in v2:
 - Moved VTCR register declarations from page.h to processor.h file
 - Fixed coding style comments
 - Added patch 2 to handle the page table walk with 4 levels
 - Added a separate patch to remove the unused VADDR_{BITS,MASK} macros

Changes in v1:
 - Initial version

Vijaya Kumar K (3):
  xen/arm: Add 4-level page table for stage 2 translation
  xen/arm: update page table walk to handle 4 level page table
  xen/arm: remove unused VADDR_BITS and VADDR_MASK macros

 xen/arch/arm/arm64/head.S       |   14 ++--
 xen/arch/arm/mm.c               |   37 +++++++----
 xen/arch/arm/p2m.c              |  136 +++++++++++++++++++++++++++++++++------
 xen/include/asm-arm/p2m.h       |    5 +-
 xen/include/asm-arm/page.h      |   19 +++---
 xen/include/asm-arm/processor.h |  102 ++++++++++++++++++++++++++++-
 6 files changed, 265 insertions(+), 48 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation
  2014-05-27  6:46 [PATCH v2 0/3] xen/arm: Add stage 2 48-bit PA support vijay.kilari
@ 2014-05-27  6:46 ` vijay.kilari
  2014-05-28 14:29   ` Ian Campbell
  2014-07-15 13:47   ` Ian Campbell
  2014-05-27  6:46 ` [PATCH v2 2/3] xen/arm: update page table walk to handle 4 level page table vijay.kilari
  2014-05-27  6:46 ` [PATCH v2 3/3] xen/arm: remove unused VADDR_BITS and VADDR_MASK macros vijay.kilari
  2 siblings, 2 replies; 11+ messages in thread
From: vijay.kilari @ 2014-05-27  6:46 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

To support a 48-bit physical address range, add 4-level
page tables for stage 2 translation.
With this patch, stage 1 and stage 2 translation at EL2
both use 4 levels.

Configure TCR_EL2.IPS and VTCR_EL2.PS at runtime, based on
the PA range supported by the platform.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/arm64/head.S       |   14 ++--
 xen/arch/arm/mm.c               |   18 +++---
 xen/arch/arm/p2m.c              |  136 +++++++++++++++++++++++++++++++++------
 xen/include/asm-arm/p2m.h       |    5 +-
 xen/include/asm-arm/page.h      |   16 +++--
 xen/include/asm-arm/processor.h |  102 ++++++++++++++++++++++++++++-
 6 files changed, 248 insertions(+), 43 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 2a13527..8396268 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -224,13 +224,13 @@ skip_bss:
         ldr   x0, =MAIRVAL
         msr   mair_el2, x0
 
-        /* Set up the HTCR:
-         * PASize -- 40 bits / 1TB
-         * Top byte is used
-         * PT walks use Inner-Shareable accesses,
-         * PT walks are write-back, write-allocate in both cache levels,
-         * Full 64-bit address space goes through this table. */
-        ldr   x0, =0x80823500
+        mrs   x1, ID_AA64MMFR0_EL1
+
+        /* Set up the HTCR */
+        ldr   x0, =TCR_VAL_BASE
+
+        /* Set TCR_EL2.IPS based on ID_AA64MMFR0_EL1.PARange */
+        bfi   x0, x1, #16, #3
         msr   tcr_el2, x0
 
         /* Set up the SCTLR_EL2:
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index eac228c..04e3182 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -377,17 +377,17 @@ void __init arch_init_memory(void)
 void __cpuinit setup_virt_paging(void)
 {
     /* Setup Stage 2 address translation */
-    /* SH0=11 (Inner-shareable)
-     * ORGN0=IRGN0=01 (Normal memory, Write-Back Write-Allocate Cacheable)
-     * SL0=01 (Level-1)
-     * ARVv7: T0SZ=(1)1000 = -8 (32-(-8) = 40 bit physical addresses)
-     * ARMv8: T0SZ=01 1000 = 24 (64-24   = 40 bit physical addresses)
-     *        PS=010 == 40 bits
-     */
 #ifdef CONFIG_ARM_32
-    WRITE_SYSREG32(0x80003558, VTCR_EL2);
+    WRITE_SYSREG32(VTCR_VAL_BASE, VTCR_EL2);
 #else
-    WRITE_SYSREG32(0x80023558, VTCR_EL2);
+    /* Update IPA 48 bit and PA 48 bit */
+    if ( current_cpu_data.mm64.pa_range == VTCR_PS_48BIT_VAL )
+        WRITE_SYSREG32(VTCR_VAL_BASE | VTCR_TOSZ_48BIT | VTCR_PS_48BIT,
+                       VTCR_EL2);
+    else
+        /* default to IPA 48 bit and PA 40 bit */
+        WRITE_SYSREG32(VTCR_VAL_BASE | VTCR_TOSZ_40BIT | VTCR_PS_40BIT,
+                       VTCR_EL2);
 #endif
     isb();
 }
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 603c097..045c003 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -10,29 +10,42 @@
 #include <asm/hardirq.h>
 #include <asm/page.h>
 
+#ifdef CONFIG_ARM_64
+/* Zeroeth level is of 1 page size */
+#define P2M_ROOT_ORDER 0
+#else
 /* First level P2M is 2 consecutive pages */
-#define P2M_FIRST_ORDER 1
-#define P2M_FIRST_ENTRIES (LPAE_ENTRIES<<P2M_FIRST_ORDER)
+#define P2M_ROOT_ORDER 1
+#endif
+#define P2M_FIRST_ENTRIES (LPAE_ENTRIES << P2M_ROOT_ORDER)
 
 void dump_p2m_lookup(struct domain *d, paddr_t addr)
 {
     struct p2m_domain *p2m = &d->arch.p2m;
-    lpae_t *first;
+    lpae_t *lookup;
 
     printk("dom%d IPA 0x%"PRIpaddr"\n", d->domain_id, addr);
 
-    if ( first_linear_offset(addr) > LPAE_ENTRIES )
+#ifdef CONFIG_ARM_64
+    if ( zeroeth_linear_offset(addr) > P2M_FIRST_ENTRIES )
+    {
+        printk("Beyond number of support entries at zeroeth level\n");
+        return;
+    }
+#else
+    if ( first_linear_offset(addr) > P2M_FIRST_ENTRIES )
     {
         printk("Cannot dump addresses in second of first level pages...\n");
         return;
     }
+#endif
 
     printk("P2M @ %p mfn:0x%lx\n",
-           p2m->first_level, page_to_mfn(p2m->first_level));
+           p2m->root_level, page_to_mfn(p2m->root_level));
 
-    first = __map_domain_page(p2m->first_level);
-    dump_pt_walk(first, addr);
-    unmap_domain_page(first);
+    lookup = __map_domain_page(p2m->root_level);
+    dump_pt_walk(lookup, addr);
+    unmap_domain_page(lookup);
 }
 
 static void p2m_load_VTTBR(struct domain *d)
@@ -72,6 +85,20 @@ void p2m_restore_state(struct vcpu *n)
     isb();
 }
 
+#ifdef CONFIG_ARM_64
+/*
+ * Map zeroeth level page that addr contains.
+ */
+static lpae_t *p2m_map_zeroeth(struct p2m_domain *p2m, paddr_t addr)
+{
+    if ( zeroeth_linear_offset(addr) >= LPAE_ENTRIES )
+        return NULL;
+
+    return __map_domain_page(p2m->root_level);
+}
+
+#else
+
 static int p2m_first_level_index(paddr_t addr)
 {
     /*
@@ -92,10 +119,11 @@ static lpae_t *p2m_map_first(struct p2m_domain *p2m, paddr_t addr)
     if ( first_linear_offset(addr) >= P2M_FIRST_ENTRIES )
         return NULL;
 
-    page = p2m->first_level + p2m_first_level_index(addr);
+    page = p2m->root_level + p2m_first_level_index(addr);
 
     return __map_domain_page(page);
 }
+#endif
 
 /*
  * Lookup the MFN corresponding to a domain's PFN.
@@ -107,6 +135,9 @@ paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)
 {
     struct p2m_domain *p2m = &d->arch.p2m;
     lpae_t pte, *first = NULL, *second = NULL, *third = NULL;
+#ifdef CONFIG_ARM_64
+    lpae_t *zeroeth = NULL;
+#endif
     paddr_t maddr = INVALID_PADDR;
     p2m_type_t _t;
 
@@ -117,9 +148,26 @@ paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)
 
     spin_lock(&p2m->lock);
 
+#ifdef CONFIG_ARM_64
+    zeroeth = p2m_map_zeroeth(p2m, paddr);
+    if ( !zeroeth )
+        goto err;
+
+    pte = zeroeth[zeroeth_table_offset(paddr)];
+    /* Zeroeth level does not support block translation
+     * so pte.p2m.table should be always set.
+     * Just check for valid bit
+     */
+    if ( !pte.p2m.valid )
+        goto done;
+
+    /* Map first level table */
+    first = map_domain_page(pte.p2m.base);
+#else
     first = p2m_map_first(p2m, paddr);
     if ( !first )
         goto err;
+#endif
 
     pte = first[first_table_offset(paddr)];
     if ( !pte.p2m.valid || !pte.p2m.table )
@@ -148,6 +196,9 @@ done:
     if (third) unmap_domain_page(third);
     if (second) unmap_domain_page(second);
     if (first) unmap_domain_page(first);
+#ifdef CONFIG_ARM_64
+    if (zeroeth) unmap_domain_page(zeroeth);
+#endif
 
 err:
     spin_unlock(&p2m->lock);
@@ -286,8 +337,14 @@ static int apply_p2m_changes(struct domain *d,
     struct p2m_domain *p2m = &d->arch.p2m;
     lpae_t *first = NULL, *second = NULL, *third = NULL;
     paddr_t addr;
-    unsigned long cur_first_page = ~0,
-                  cur_first_offset = ~0,
+#ifdef CONFIG_ARM_64
+    lpae_t *zeroeth = NULL;
+    unsigned long cur_zeroeth_page = ~0,
+                  cur_zeroeth_offset = ~0;
+#else
+    unsigned long cur_first_page = ~0;
+#endif
+    unsigned long cur_first_offset = ~0,
                   cur_second_offset = ~0;
     unsigned long count = 0;
     unsigned int flush = 0;
@@ -299,6 +356,44 @@ static int apply_p2m_changes(struct domain *d,
     addr = start_gpaddr;
     while ( addr < end_gpaddr )
     {
+#ifdef CONFIG_ARM_64
+        /* Find zeroeth offset and map zeroeth page */
+        if ( cur_zeroeth_page != zeroeth_table_offset(addr) )
+        {
+            if ( zeroeth ) unmap_domain_page(zeroeth);
+            zeroeth = p2m_map_zeroeth(p2m, addr);
+            if ( !zeroeth )
+            {
+                rc = -EINVAL;
+                goto out;
+            }
+            cur_zeroeth_page = zeroeth_table_offset(addr);
+        }
+
+        if ( !zeroeth[zeroeth_table_offset(addr)].p2m.valid )
+        {
+            if ( !populate )
+            {
+                addr = (addr + ZEROETH_SIZE) & ZEROETH_MASK;
+                continue;
+            }
+            rc = p2m_create_table(d, &zeroeth[zeroeth_table_offset(addr)]);
+            if ( rc < 0 )
+            {
+                printk("p2m_populate_ram: L0 failed\n");
+                goto out;
+            }
+        } 
+
+        BUG_ON(!zeroeth[zeroeth_table_offset(addr)].p2m.valid);
+
+        if ( cur_zeroeth_offset != zeroeth_table_offset(addr) )
+        {
+            if ( first ) unmap_domain_page(first);
+            first = map_domain_page(zeroeth[zeroeth_table_offset(addr)].p2m.base);
+            cur_zeroeth_offset = zeroeth_table_offset(addr);
+        }
+#else
         if ( cur_first_page != p2m_first_level_index(addr) )
         {
             if ( first ) unmap_domain_page(first);
@@ -310,7 +405,7 @@ static int apply_p2m_changes(struct domain *d,
             }
             cur_first_page = p2m_first_level_index(addr);
         }
-
+#endif
         if ( !first[first_table_offset(addr)].p2m.valid )
         {
             if ( !populate )
@@ -479,6 +574,9 @@ out:
     if (third) unmap_domain_page(third);
     if (second) unmap_domain_page(second);
     if (first) unmap_domain_page(first);
+#ifdef CONFIG_ARM_64
+    if ( zeroeth ) unmap_domain_page(zeroeth);
+#endif
 
     spin_unlock(&p2m->lock);
 
@@ -530,7 +628,7 @@ int p2m_alloc_table(struct domain *d)
     struct page_info *page;
     void *p;
 
-    page = alloc_domheap_pages(NULL, P2M_FIRST_ORDER, 0);
+    page = alloc_domheap_pages(NULL, P2M_ROOT_ORDER, 0);
     if ( page == NULL )
         return -ENOMEM;
 
@@ -541,13 +639,15 @@ int p2m_alloc_table(struct domain *d)
     clear_page(p);
     unmap_domain_page(p);
 
+#ifdef CONFIG_ARM_32
     p = __map_domain_page(page + 1);
     clear_page(p);
     unmap_domain_page(p);
+#endif
 
-    p2m->first_level = page;
+    p2m->root_level = page;
 
-    d->arch.vttbr = page_to_maddr(p2m->first_level)
+    d->arch.vttbr = page_to_maddr(p2m->root_level)
         | ((uint64_t)p2m->vmid&0xff)<<48;
 
     p2m_load_VTTBR(d);
@@ -628,9 +728,9 @@ void p2m_teardown(struct domain *d)
     while ( (pg = page_list_remove_head(&p2m->pages)) )
         free_domheap_page(pg);
 
-    free_domheap_pages(p2m->first_level, P2M_FIRST_ORDER);
+    free_domheap_pages(p2m->root_level, P2M_ROOT_ORDER);
 
-    p2m->first_level = NULL;
+    p2m->root_level = NULL;
 
     p2m_free_vmid(d);
 
@@ -654,7 +754,7 @@ int p2m_init(struct domain *d)
 
     d->arch.vttbr = 0;
 
-    p2m->first_level = NULL;
+    p2m->root_level = NULL;
 
     p2m->max_mapped_gfn = 0;
     p2m->lowest_mapped_gfn = ULONG_MAX;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index bd71abe..c33fc4d 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -13,8 +13,9 @@ struct p2m_domain {
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
-    /* Root of p2m page tables, 2 contiguous pages */
-    struct page_info *first_level;
+    /* ARMv7: Root of p2m page tables, 2 contiguous pages */
+    /* ARMv8: Look up table is zeroeth level */
+    struct page_info *root_level;
 
     /* Current VMID in use */
     uint8_t vmid;
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index c38e9c9..31c08b4 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -6,7 +6,11 @@
 #include <public/xen.h>
 #include <asm/processor.h>
 
+#ifdef CONFIG_ARM_64
+#define PADDR_BITS              48
+#else
 #define PADDR_BITS              40
+#endif
 #define PADDR_MASK              ((1ULL << PADDR_BITS)-1)
 
 #define VADDR_BITS              32
@@ -110,8 +114,8 @@ typedef struct __packed {
     unsigned long ng:1;         /* Not-Global */
 
     /* The base address must be appropriately aligned for Block entries */
-    unsigned long base:28;      /* Base address of block or next table */
-    unsigned long sbz:12;       /* Must be zero */
+    unsigned long base:36;      /* Base address of block or next table */
+    unsigned long sbz:4;       /* Must be zero */
 
     /* These seven bits are only used in Block entries and are ignored
      * in Table entries. */
@@ -145,8 +149,8 @@ typedef struct __packed {
     unsigned long sbz4:1;
 
     /* The base address must be appropriately aligned for Block entries */
-    unsigned long base:28;      /* Base address of block or next table */
-    unsigned long sbz3:12;
+    unsigned long base:36;      /* Base address of block or next table */
+    unsigned long sbz3:4;
 
     /* These seven bits are only used in Block entries and are ignored
      * in Table entries. */
@@ -170,9 +174,9 @@ typedef struct __packed {
     unsigned long pad2:10;
 
     /* The base address must be appropriately aligned for Block entries */
-    unsigned long base:28;      /* Base address of block or next table */
+    unsigned long base:36;      /* Base address of block or next table */
 
-    unsigned long pad1:24;
+    unsigned long pad1:16;
 } lpae_walk_t;
 
 typedef union {
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 5978b8a..23c2f66 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -31,6 +31,95 @@
 #define MPIDR_AFFINITY_LEVEL(mpidr, level) \
          ((mpidr >> MPIDR_LEVEL_SHIFT(level)) & MPIDR_LEVEL_MASK)
 
+/* 
+ * VTCR register configuration for stage 2 translation
+ */
+#define VTCR_T0SZ_SHIFT   0
+#define VTCR_TOSZ_40BIT  (24 << VTCR_T0SZ_SHIFT)
+#define VTCR_TOSZ_48BIT  (16 << VTCR_T0SZ_SHIFT)
+
+#define VTCR_SL0_SHIFT    6
+#define VTCR_SL0_0       (0x2 << VTCR_SL0_SHIFT)
+#define VTCR_SL0_1       (0x1 << VTCR_SL0_SHIFT)
+#define VTCR_SL0_2       (0x0 << VTCR_SL0_SHIFT)
+
+#define VTCR_IRGN0_SHIFT  8
+#define VTCR_IRGN0_NC    (0x0 << VTCR_IRGN0_SHIFT)
+#define VTCR_IRGN0_WBWA  (0x1 << VTCR_IRGN0_SHIFT)
+#define VTCR_IRGN0_WT    (0x2 << VTCR_IRGN0_SHIFT)
+#define VTCR_IRGN0_WB    (0x3 << VTCR_IRGN0_SHIFT)
+
+#define VTCR_ORGN0_SHIFT  10
+#define VTCR_ORGN0_NC    (0x0 << VTCR_ORGN0_SHIFT)
+#define VTCR_ORGN0_WBWA  (0x1 << VTCR_ORGN0_SHIFT)
+#define VTCR_ORGN0_WT    (0x2 << VTCR_ORGN0_SHIFT)
+#define VTCR_ORGN0_WB    (0x3 << VTCR_ORGN0_SHIFT)
+
+#define VTCR_SH0_SHIFT    12
+#define VTCR_SH0_NS      (0x0 << VTCR_SH0_SHIFT)
+#define VTCR_SH0_OS      (0x2 << VTCR_SH0_SHIFT)
+#define VTCR_SH0_IS      (0x3 << VTCR_SH0_SHIFT)
+
+#define VTCR_TG0_SHIFT    14
+#define VTCR_TG0_4K      (0x0 << VTCR_TG0_SHIFT)
+#define VTCR_TG0_64K     (0x1 << VTCR_TG0_SHIFT)
+
+#define VTCR_PS_SHIFT     16
+#define VTCR_PS_32BIT    (0x0 << VTCR_PS_SHIFT)
+#define VTCR_PS_40BIT    (0x2 << VTCR_PS_SHIFT)
+#define VTCR_PS_48BIT    (0x5 << VTCR_PS_SHIFT)
+#define VTCR_PS_48BIT_VAL   0x5
+
+#ifdef CONFIG_ARM_64
+/*
+ * SL0=10 => Level-0 initial look up level
+ * SH0=11 => Inner-shareable
+ * ORGN0=IRGN0=01 => Normal memory, Write-Back Write-Allocate Cacheable
+ * TG0=00 => 4K page granular size
+ */
+#define VTCR_VAL_BASE  ((VTCR_SL0_0)      | \
+                        (VTCR_IRGN0_WBWA) | \
+                        (VTCR_ORGN0_WBWA) | \
+                        (VTCR_SH0_OS)     | \
+                        (VTCR_TG0_4K))
+#else
+/*
+ * T0SZ=(1)1000 => -8 (32-(-8) = 40 bit IPA)
+ * SL0=01 => Level-1 initial look up level
+ * SH0=11 => Inner-shareable
+ * ORGN0=IRGN0=01 => Normal memory, Write-Back Write-Allocate Cacheable
+ * TG0=00 => 4K page granular size
+ * PS=010 => 40 bits
+ * 40 bit IPA and 32 bit PA
+ */
+#define VTCR_VAL_BASE  ((VTCR_TOSZ_40BIT) | \
+                       (VTCR_SL0_1)      | \
+                       (VTCR_IRGN0_WBWA) | \
+                       (VTCR_ORGN0_WBWA) | \
+                       (VTCR_SH0_OS)     | \
+                       (VTCR_TG0_4K)     | \
+                       (VTCR_PS_32BIT))
+#endif
+
+/* TCR register configuration for Xen Stage 1 translation*/
+
+#define TCR_TBI_SHIFT       20
+#define TCR_TBI_USE_TBYTE  (0x0 << TCR_TBI_SHIFT)
+
+#ifdef CONFIG_ARM_64
+/* 
+ * 48 bit Hypervisor - VA  to 40 bit PA
+ * if platform supports 48 bit PA update runtime in head.S
+ */
+#define TCR_VAL_BASE   ((VTCR_TOSZ_48BIT)   | \
+                       (VTCR_IRGN0_WBWA)   | \
+                       (VTCR_ORGN0_WBWA)   | \
+                       (VTCR_SH0_OS)       | \
+                       (VTCR_TG0_4K)       | \
+                       (VTCR_PS_40BIT)     | \
+                       (TCR_TBI_USE_TBYTE))
+#endif
+
 /* TTBCR Translation Table Base Control Register */
 #define TTBCR_EAE    _AC(0x80000000,U)
 #define TTBCR_N_MASK _AC(0x07,U)
@@ -202,8 +291,19 @@ struct cpuinfo_arm {
         uint64_t bits[2];
     } aux64;
 
-    struct {
+    union {
         uint64_t bits[2];
+        struct {
+            unsigned long pa_range:4;
+            unsigned long asid_bits:4;
+            unsigned long bigend:4;
+            unsigned long secure_ns:4;
+            unsigned long bigend_el0:4;
+            unsigned long tgranule_16K:4;
+            unsigned long tgranule_64K:4;
+            unsigned long tgranule_4K:4;
+            unsigned long __res0:32;
+       };
     } mm64;
 
     struct {
-- 
1.7.9.5


* [PATCH v2 2/3] xen/arm: update page table walk to handle 4 level page table
  2014-05-27  6:46 [PATCH v2 0/3] xen/arm: Add stage 2 48-bit PA support vijay.kilari
  2014-05-27  6:46 ` [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation vijay.kilari
@ 2014-05-27  6:46 ` vijay.kilari
  2014-05-28 14:14   ` Ian Campbell
  2014-05-27  6:46 ` [PATCH v2 3/3] xen/arm: remove unused VADDR_BITS and VADDR_MASK macros vijay.kilari
  2 siblings, 1 reply; 11+ messages in thread
From: vijay.kilari @ 2014-05-27  6:46 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

ARM64 supports 4-level page tables. Update the page
table walk function to handle all 4 levels.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/mm.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 04e3182..ef3c53e 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -162,13 +162,25 @@ static inline void check_memory_layout_alignment_constraints(void) {
 #endif
 }
 
-void dump_pt_walk(lpae_t *first, paddr_t addr)
+void dump_pt_walk(lpae_t *root, paddr_t addr)
 {
     lpae_t *second = NULL, *third = NULL;
+    lpae_t *first = NULL;
+#ifdef CONFIG_ARM_64
+    if ( zeroeth_table_offset(addr) >= LPAE_ENTRIES )
+        return;
 
+    printk("0TH[0x%x] = 0x%"PRIpaddr"\n", zeroeth_table_offset(addr),
+           root[zeroeth_table_offset(addr)].bits);
+    if ( !root[zeroeth_table_offset(addr)].walk.valid )
+        goto done;
+
+    first = map_domain_page(root[zeroeth_table_offset(addr)].walk.base);
+#else
+    first = root;
     if ( first_table_offset(addr) >= LPAE_ENTRIES )
         return;
-
+#endif
     printk("1ST[0x%x] = 0x%"PRIpaddr"\n", first_table_offset(addr),
            first[first_table_offset(addr)].bits);
     if ( !first[first_table_offset(addr)].walk.valid ||
@@ -189,6 +201,9 @@ void dump_pt_walk(lpae_t *first, paddr_t addr)
 done:
     if (third) unmap_domain_page(third);
     if (second) unmap_domain_page(second);
+#ifdef CONFIG_ARM_64
+    if ( first ) unmap_domain_page(first);
+#endif
 
 }
 
-- 
1.7.9.5


* [PATCH v2 3/3] xen/arm: remove unused VADDR_BITS and VADDR_MASK macros
  2014-05-27  6:46 [PATCH v2 0/3] xen/arm: Add stage 2 48-bit PA support vijay.kilari
  2014-05-27  6:46 ` [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation vijay.kilari
  2014-05-27  6:46 ` [PATCH v2 2/3] xen/arm: update page table walk to handle 4 level page table vijay.kilari
@ 2014-05-27  6:46 ` vijay.kilari
  2014-05-28 14:15   ` Ian Campbell
  2 siblings, 1 reply; 11+ messages in thread
From: vijay.kilari @ 2014-05-27  6:46 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Remove the unused VADDR_BITS and VADDR_MASK macros.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/include/asm-arm/page.h |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 31c08b4..e0854c9 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -13,9 +13,6 @@
 #endif
 #define PADDR_MASK              ((1ULL << PADDR_BITS)-1)
 
-#define VADDR_BITS              32
-#define VADDR_MASK              (~0UL)
-
 /* Shareability values for the LPAE entries */
 #define LPAE_SH_NON_SHAREABLE 0x0
 #define LPAE_SH_UNPREDICTALE  0x1
-- 
1.7.9.5


* Re: [PATCH v2 2/3] xen/arm: update page table walk to handle 4 level page table
  2014-05-27  6:46 ` [PATCH v2 2/3] xen/arm: update page table walk to handle 4 level page table vijay.kilari
@ 2014-05-28 14:14   ` Ian Campbell
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Campbell @ 2014-05-28 14:14 UTC (permalink / raw)
  To: vijay.kilari
  Cc: stefano.stabellini, Prasun.Kapoor, vijaya.kumar, julien.grall,
	xen-devel, stefano.stabellini

On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> ARM64 supports 4-level page tables. Update page
> table walk function to handle 4 levels
> 
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>



* Re: [PATCH v2 3/3] xen/arm: remove unused VADDR_BITS and VADDR_MASK macros
  2014-05-27  6:46 ` [PATCH v2 3/3] xen/arm: remove unused VADDR_BITS and VADDR_MASK macros vijay.kilari
@ 2014-05-28 14:15   ` Ian Campbell
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Campbell @ 2014-05-28 14:15 UTC (permalink / raw)
  To: vijay.kilari
  Cc: stefano.stabellini, Prasun.Kapoor, vijaya.kumar, julien.grall,
	xen-devel, stefano.stabellini

On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> Remove unused VADDR_BITS and VADDR_MASK macros
> 
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>



* Re: [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation
  2014-05-27  6:46 ` [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation vijay.kilari
@ 2014-05-28 14:29   ` Ian Campbell
  2014-07-15 13:47   ` Ian Campbell
  1 sibling, 0 replies; 11+ messages in thread
From: Ian Campbell @ 2014-05-28 14:29 UTC (permalink / raw)
  To: vijay.kilari
  Cc: stefano.stabellini, Prasun.Kapoor, vijaya.kumar, julien.grall,
	xen-devel, stefano.stabellini

On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> To support 48-bit Physical Address support, add 4-level
> page tables for stage 2 translation.
> With this patch stage 1 and stage 2 translation at EL2 are
> with 4-levels
> 
> Configure TCR_EL2.IPS and VTCR_EL2.PS based on platform
> supported PA range at runtime
> 
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> ---
>  xen/arch/arm/arm64/head.S       |   14 ++--
>  xen/arch/arm/mm.c               |   18 +++---
>  xen/arch/arm/p2m.c              |  136 +++++++++++++++++++++++++++++++++------
>  xen/include/asm-arm/p2m.h       |    5 +-
>  xen/include/asm-arm/page.h      |   16 +++--
>  xen/include/asm-arm/processor.h |  102 ++++++++++++++++++++++++++++-
>  6 files changed, 248 insertions(+), 43 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 2a13527..8396268 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -224,13 +224,13 @@ skip_bss:
>          ldr   x0, =MAIRVAL
>          msr   mair_el2, x0
>  
> -        /* Set up the HTCR:
> -         * PASize -- 40 bits / 1TB
> -         * Top byte is used
> -         * PT walks use Inner-Shareable accesses,
> -         * PT walks are write-back, write-allocate in both cache levels,
> -         * Full 64-bit address space goes through this table. */
> -        ldr   x0, =0x80823500
> +        mrs   x1, ID_AA64MMFR0_EL1
> +
> +        /* Set up the HTCR */
> +        ldr   x0, =TCR_VAL_BASE
> +
> +        /* Set TCR_EL2.IPS based on ID_AA64MMFR0_EL1.PARange */
> +        bfi   x0, x1, #16, #3
>          msr   tcr_el2, x0
>  
>          /* Set up the SCTLR_EL2:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index eac228c..04e3182 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -377,17 +377,17 @@ void __init arch_init_memory(void)
>  void __cpuinit setup_virt_paging(void)
>  {
>      /* Setup Stage 2 address translation */
> -    /* SH0=11 (Inner-shareable)
> -     * ORGN0=IRGN0=01 (Normal memory, Write-Back Write-Allocate Cacheable)
> -     * SL0=01 (Level-1)
> -     * ARVv7: T0SZ=(1)1000 = -8 (32-(-8) = 40 bit physical addresses)
> -     * ARMv8: T0SZ=01 1000 = 24 (64-24   = 40 bit physical addresses)
> -     *        PS=010 == 40 bits
> -     */
>  #ifdef CONFIG_ARM_32
> -    WRITE_SYSREG32(0x80003558, VTCR_EL2);
> +    WRITE_SYSREG32(VTCR_VAL_BASE, VTCR_EL2);
>  #else
> -    WRITE_SYSREG32(0x80023558, VTCR_EL2);
> +    /* Update IPA 48 bit and PA 48 bit */
> +    if ( current_cpu_data.mm64.pa_range == VTCR_PS_48BIT_VAL )

What about values other than 48 or 40 bit?

Why is the content of ID_AA64MMFR0_EL1 being compared to a #define
relating to VTCR? Can't mm64.pa_range be transformed directly into the
right value for VTCR without needing to jump through these hoops (like
what you do for tcr_el2)?


> +        WRITE_SYSREG32(VTCR_VAL_BASE | VTCR_TOSZ_48BIT | VTCR_PS_48BIT,
> +                       VTCR_EL2);
> +    else
> +        /* default to IPA 48 bit and PA 40 bit */
> +        WRITE_SYSREG32(VTCR_VAL_BASE | VTCR_TOSZ_40BIT | VTCR_PS_40BIT,
> +                       VTCR_EL2);
>  #endif
>      isb();
>  }
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 603c097..045c003 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -10,29 +10,42 @@
>  #include <asm/hardirq.h>
>  #include <asm/page.h>
>  
> +#ifdef CONFIG_ARM_64
> +/* Zeroeth level is of 1 page size */
> +#define P2M_ROOT_ORDER 0
> +#else
>  /* First level P2M is 2 consecutive pages */
> -#define P2M_FIRST_ORDER 1
> -#define P2M_FIRST_ENTRIES (LPAE_ENTRIES<<P2M_FIRST_ORDER)
> +#define P2M_ROOT_ORDER 1
> +#endif
> +#define P2M_FIRST_ENTRIES (LPAE_ENTRIES << P2M_ROOT_ORDER)
>  
>  void dump_p2m_lookup(struct domain *d, paddr_t addr)
>  {
>      struct p2m_domain *p2m = &d->arch.p2m;
> -    lpae_t *first;
> +    lpae_t *lookup;
>  
>      printk("dom%d IPA 0x%"PRIpaddr"\n", d->domain_id, addr);
>  
> -    if ( first_linear_offset(addr) > LPAE_ENTRIES )
> +#ifdef CONFIG_ARM_64
> +    if ( zeroeth_linear_offset(addr) > P2M_FIRST_ENTRIES )
> +    {
> +        printk("Beyond number of support entries at zeroeth level\n");

"supported".

But actually I think this would be a programming error and therefore
could be an assert or BUG_ON.

> +        return;
> +    }
> +#else
> +    if ( first_linear_offset(addr) > P2M_FIRST_ENTRIES )
>      {
>          printk("Cannot dump addresses in second of first level pages...\n");
>          return;
>      }
> +#endif
>  
> @@ -107,6 +135,9 @@ paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)
>  {
>      struct p2m_domain *p2m = &d->arch.p2m;
>      lpae_t pte, *first = NULL, *second = NULL, *third = NULL;
> +#ifdef CONFIG_ARM_64
> +    lpae_t *zeroeth = NULL;
> +#endif
>      paddr_t maddr = INVALID_PADDR;
>      p2m_type_t _t;
>  
> @@ -117,9 +148,26 @@ paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)
>  
>      spin_lock(&p2m->lock);
>  
> +#ifdef CONFIG_ARM_64
> +    zeroeth = p2m_map_zeroeth(p2m, paddr);

Naming this function p2m_map_root might allow you to get rid of some of
the ifdefs.

> +    if ( !zeroeth )
> +        goto err;
> +
> +    pte = zeroeth[zeroeth_table_offset(paddr)];
> +    /* Zeroeth level does not support block translation
> +     * so pte.p2m.table should be always set.

ASSERT/BUG_ON?

> @@ -541,13 +639,15 @@ int p2m_alloc_table(struct domain *d)
>      clear_page(p);
>      unmap_domain_page(p);
>  
> +#ifdef CONFIG_ARM_32

Since you've defined it and used it for the allocation, this should be
based on P2M_ROOT_ORDER, I think.


>      p = __map_domain_page(page + 1);
>      clear_page(p);
>      unmap_domain_page(p);
> +#endif
>  
> -    p2m->first_level = page;
> +    p2m->root_level = page;

You could profitably have done this rename in a precursor patch, which I
think would have reduced this one to a more manageable size.
Never mind now though.


> @@ -110,8 +114,8 @@ typedef struct __packed {
>      unsigned long ng:1;         /* Not-Global */
>  
>      /* The base address must be appropriately aligned for Block entries */
> -    unsigned long base:28;      /* Base address of block or next table */
> -    unsigned long sbz:12;       /* Must be zero */
> +    unsigned long base:36;      /* Base address of block or next table */
> +    unsigned long sbz:4;       /* Must be zero */

Alignment has gotten undone here.

>  
>      /* These seven bits are only used in Block entries and are ignored
>       * in Table entries. */
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 5978b8a..23c2f66 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -31,6 +31,95 @@
>  #define MPIDR_AFFINITY_LEVEL(mpidr, level) \
>           ((mpidr >> MPIDR_LEVEL_SHIFT(level)) & MPIDR_LEVEL_MASK)
>  
> +/* 
> + * VTCR register configuration for stage 2 translation
> + */
> +#define VTCR_T0SZ_SHIFT   0
> +#define VTCR_TOSZ_40BIT  (24 << VTCR_T0SZ_SHIFT)
> +#define VTCR_TOSZ_48BIT  (16 << VTCR_T0SZ_SHIFT)
> +
> +#define VTCR_SL0_SHIFT    6
> +#define VTCR_SL0_0       (0x2 << VTCR_SL0_SHIFT)
> +#define VTCR_SL0_1       (0x1 << VTCR_SL0_SHIFT)
> +#define VTCR_SL0_2       (0x0 << VTCR_SL0_SHIFT)
> +
> +#define VTCR_IRGN0_SHIFT  8
> +#define VTCR_IRGN0_NC    (0x0 << VTCR_IRGN0_SHIFT)
> +#define VTCR_IRGN0_WBWA  (0x1 << VTCR_IRGN0_SHIFT)
> +#define VTCR_IRGN0_WT    (0x2 << VTCR_IRGN0_SHIFT)
> +#define VTCR_IRGN0_WB    (0x3 << VTCR_IRGN0_SHIFT)
> +
> +#define VTCR_ORGN0_SHIFT  10
> +#define VTCR_ORGN0_NC    (0x0 << VTCR_ORGN0_SHIFT)
> +#define VTCR_ORGN0_WBWA  (0x1 << VTCR_ORGN0_SHIFT)
> +#define VTCR_ORGN0_WT    (0x2 << VTCR_ORGN0_SHIFT)
> +#define VTCR_ORGN0_WB    (0x3 << VTCR_ORGN0_SHIFT)
> +
> +#define VTCR_SH0_SHIFT    12
> +#define VTCR_SH0_NS      (0x0 << VTCR_SH0_SHIFT)
> +#define VTCR_SH0_OS      (0x2 << VTCR_SH0_SHIFT)
> +#define VTCR_SH0_IS      (0x3 << VTCR_SH0_SHIFT)
> +
> +#define VTCR_TG0_SHIFT    14
> +#define VTCR_TG0_4K      (0x0 << VTCR_TG0_SHIFT)
> +#define VTCR_TG0_64K     (0x1 << VTCR_TG0_SHIFT)
> +
> +#define VTCR_PS_SHIFT     16
> +#define VTCR_PS_32BIT    (0x0 << VTCR_PS_SHIFT)
> +#define VTCR_PS_40BIT    (0x2 << VTCR_PS_SHIFT)
> +#define VTCR_PS_48BIT    (0x5 << VTCR_PS_SHIFT)
> +#define VTCR_PS_48BIT_VAL   0x5

This spurious _VAL is interesting...

> +
> +#ifdef CONFIG_ARM_64
> +/*
> + * SL0=10 => Level-0 initial look up level
> + * SH0=11 => Inner-shareable
> + * ORGN0=IRGN0=01 => Normal memory, Write-Back Write-Allocate Cacheable
> + * TG0=00 => 4K page granular size
> + */
> +#define VTCR_VAL_BASE  ((VTCR_SL0_0)      | \
> +                        (VTCR_IRGN0_WBWA) | \
> +                        (VTCR_ORGN0_WBWA) | \
> +                        (VTCR_SH0_OS)     | \
> +                        (VTCR_TG0_4K))
> +#else
> +/*
> + * T0SZ=(1)1000 => -8 (32-(-8) = 40 bit IPA)
> + * SL0=01 => Level-1 initial look up level
> + * SH0=11 => Inner-shareable
> + * ORGN0=IRGN0=01 => Normal memory, Write-Back Write-Allocate Cacheable
> + * TG0=00 => 4K page granular size
> + * PS=010 => 40 bits
> + * 40 bit IPA and 32 bit PA
> + */
> +#define VTCR_VAL_BASE  ((VTCR_TOSZ_40BIT) | \

I didn't see you actually patching arm32/head.S to use this.

> +                       (VTCR_SL0_1)      | \
> +                       (VTCR_IRGN0_WBWA) | \
> +                       (VTCR_ORGN0_WBWA) | \
> +                       (VTCR_SH0_OS)     | \
> +                       (VTCR_TG0_4K)     | \
> +                       (VTCR_PS_32BIT))
> +#endif
> +
> +/* TCR register configuration for Xen Stage 1 translation*/
> +
> +#define TCR_TBI_SHIFT       20
> +#define TCR_TBI_USE_TBYTE  (0x0 << TCR_TBI_SHIFT)

0x0? Did you mean 0x1?

Isn't this 64-bit specific?

> +
> +#ifdef CONFIG_ARM_64
> +/* 
> + * 48 bit Hypervisor - VA  to 40 bit PA

This is far less information than was in the head.S comment which you
removed.

> + * if platform supports 48 bit PA update runtime in head.S

I think you should omit VTCR_PS_* here altogether and require that it is
unconditionally set appropriately in head.S (which seems to be what you
have implemented anyway)

> + */
> +#define TCR_VAL_BASE   ((VTCR_TOSZ_48BIT)   | \
> +                       (VTCR_IRGN0_WBWA)   | \
> +                       (VTCR_ORGN0_WBWA)   | \
> +                       (VTCR_SH0_OS)       | \
> +                       (VTCR_TG0_4K)       | \
> +                       (VTCR_PS_40BIT)     | \
> +                       (TCR_TBI_USE_TBYTE))

No 32-bit equivalent?


Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation
  2014-05-27  6:46 ` [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation vijay.kilari
  2014-05-28 14:29   ` Ian Campbell
@ 2014-07-15 13:47   ` Ian Campbell
  2014-07-16 11:42     ` Vijay Kilari
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Campbell @ 2014-07-15 13:47 UTC (permalink / raw)
  To: vijay.kilari
  Cc: stefano.stabellini, Prasun.Kapoor, vijaya.kumar, julien.grall,
	xen-devel, stefano.stabellini

On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> To support 48-bit Physical Address support, add 4-level
> page tables for stage 2 translation.
> With this patch stage 1 and stage 2 translation at EL2 are
> with 4-levels

I've been playing with this patch on various platforms and unfortunately
there is a rather large snag here, AFAICT. Which is that 4-level stage 2
pages are only available when ID_AA64MMFR0_EL1.PARange >= 44 bits, see
table D4-5 in the ARMv8 ARM.

This means that for processors with smaller physical address ranges,
configuring VTCR_EL2.SL0 as 0b10 (starting level 0) causes a translation
fault. I'm observing this on platforms which have a 40- or 42-bit
PARange, for example.

This basically means that it is not possible to simply use 4-levels
statically on arm64, which I think I may have previously asserted to you
would be fine, sorry -- I hadn't appreciated the full implications of
this bit of the ARM -- I was confused because SL0 is the inverse of the
starting level it configures, so I read it as saying the maximum
starting level was 1 or 2 for all PARanges, but actually it is either 1
or 0 depending on the range (encoded as 1 and 2 in SL0). Sigh.

Anyway, this means that if we want to support 44+-bit hardware then we
need to dynamically switch between 3- and 4-level tables depending on
the h/w capabilities. This is doable but a bit annoying.
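The dynamic switch could be sketched roughly as follows (hypothetical struct and helper, not Xen code; the SL0 encodings match the patch's VTCR_SL0_* defines, where 0b10 selects a level-0 start and 0b01 a level-1 start):

```c
#include <assert.h>

/* Illustrative sketch: pick between 4-level and 3-level (concatenated)
 * stage 2 tables at runtime from the hardware PA size. */
struct p2m_levels {
    unsigned int sl0;        /* VTCR_EL2.SL0 field value */
    unsigned int root_order; /* log2 of concatenated root pages */
};

static struct p2m_levels choose_p2m_levels(unsigned int pa_bits)
{
    struct p2m_levels l;

    if ( pa_bits >= 44 )
    {
        l.sl0 = 2;           /* start at level 0, single root page */
        l.root_order = 0;
    }
    else
    {
        l.sl0 = 1;           /* start at level 1 */
        /* One level-1 page covers 39 bits; concatenate for the rest. */
        l.root_order = pa_bits > 39 ? pa_bits - 39 : 0;
    }
    return l;
}
```

For a 40-bit PARange this gives SL0=1 with two concatenated root pages, i.e. exactly what the current 3-level code does.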

I know you have a platform which is 48-bit capable but can you confirm
that there are things (peripherals, RAM, etc) mapped above the 2^40
limit before I start to implement this?

Alternatively if you think my reading of the ARMv8 ARM is wrong please
say so.

I also note that different processors in a big.LITTLE system can have
different PARanges. That's going to be fun!

Ian.


* Re: [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation
  2014-07-15 13:47   ` Ian Campbell
@ 2014-07-16 11:42     ` Vijay Kilari
  2014-07-16 12:35       ` Ian Campbell
  0 siblings, 1 reply; 11+ messages in thread
From: Vijay Kilari @ 2014-07-16 11:42 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	xen-devel@lists.xen.org, Stefano Stabellini

On Tue, Jul 15, 2014 at 7:17 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>
>> To support 48-bit Physical Address support, add 4-level
>> page tables for stage 2 translation.
>> With this patch stage 1 and stage 2 translation at EL2 are
>> with 4-levels
>
> I've been playing with this patch on various platforms and unfortunately
> there is a rather large snag here, AFAICT. Which is that 4-level stage 2
> pages are only available when ID_AA64MMFR0_EL1.PARange >= 44 bits, see
> table D4-5 in the ARMv8 ARM.

 I read from "Table D5-18 Properties of the address lookup levels, 4KB
granule size", where the lookup level and granule size details are
provided. For a PA above 39 bits, level zero is used.

IMO, for translation above 40 bits, 4 levels are required. However, if we
use concatenation, translation can be managed with 3 levels.
Please see "Figure D5-8 General view of VMSAv8-64 stage 2 address translation,
4KB granule".
I think Xen is using the concatenation method to handle 40-bit translation.

>
> This means that for processors with smaller physical address ranges
> configuring VTCR_EL2.SL0 as 0b10 (starting level 0) causes translation
> fault. I'm observing this on platforms which have 40 or 42 bit PARange
> for example.
>
> This basically means that it is not possible to simply use 4-levels
> statically on arm64, which I think I may have previously asserted to you
> would be fine, sorry -- I hadn't appreciated the full implications of
> this bit of the ARM -- I was confused because SL0 is the inverse of the
> starting level it configures, so I read it as saying the maximum
> starting level was 1 or 2 for all PARanges, but actually it is either 1
> or 0 depending on the range (encoded as 1 and 2 in SL0). Sigh.
>
> Anyway, this means that if we want to support 44+-bit hardware then we
> need to dynamically switch between 3- and 4-level tables depending on
> the h/w capabilities. This is doable but a bit annoying.
>
> I know you have a platform which is 48-bit capable but can you confirm
> that there are things (peripherals, RAM, etc) mapped above the 2^40
> limit before I start to implement this?

Yes, our platform has peripherals placed above the 2^40 limit, which I
could access with the 48-bit support patch series.

>
> Alternatively if you think my reading of the ARMv8 ARM is wrong please
> say so.
>
> I also note that different processors in a big.LITTLE system can have
> different PARanges. That's going to be fun!
>
> Ian.
>


* Re: [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation
  2014-07-16 11:42     ` Vijay Kilari
@ 2014-07-16 12:35       ` Ian Campbell
  2014-07-16 13:53         ` Vijay Kilari
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Campbell @ 2014-07-16 12:35 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	xen-devel@lists.xen.org, Stefano Stabellini

On Wed, 2014-07-16 at 17:12 +0530, Vijay Kilari wrote:
> On Tue, Jul 15, 2014 at 7:17 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
> >> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> >>
> >> To support 48-bit Physical Address support, add 4-level
> >> page tables for stage 2 translation.
> >> With this patch stage 1 and stage 2 translation at EL2 are
> >> with 4-levels
> >
> > I've been playing with this patch on various platforms and unfortunately
> > there is a rather large snag here, AFAICT. Which is that 4-level stage 2
> > pages are only available when ID_AA64MMFR0_EL1.PARange >= 44 bits, see
> > table D4-5 in the ARMv8 ARM.
> 
>  I read from "Table D5-18 Properties of the address lookup levels, 4KB
> granule size", where the lookup level and granule size details are
> provided. For a PA above 39 bits, level zero is used.

This is D4-19 in my copy -- OOI which version of the ARM are you working
from? Mine is ARM DDI0487A.B.

> IMO, for translation above 40 bits, 4 levels are required. However, if we
> use concatenation, translation can be managed with 3 levels.
> Please see "Figure D5-8 General view of VMSAv8-64 stage 2 address translation,
> 4KB granule".
> I think Xen is using the concatenation method to handle 40-bit translation.

Correct, but that doesn't scale to 48 bits; 3 levels max out at 16
concatenated first-level tables, allowing for 43-bit addressing.

So the upshot is that to support both 40- and 48-bit systems we must
dynamically support 4-level and 3-level (concatenated) tables.
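The arithmetic behind these limits, for a 4KB granule: each level-1 entry maps 1GB (2^30 bytes), a 4KB table holds 512 entries (9 bits), and stage 2 permits at most 16 concatenated level-1 tables, each doubling adding one more bit of IPA. A quick check (illustrative helper, not Xen code):

```c
#include <assert.h>

/* Sanity-check the 3-level stage 2 limits: one level-1 table covers
 * 30 + 9 = 39 bits of IPA; each doubling of concatenated tables adds
 * one bit, up to the architectural maximum of 16 tables. */
static unsigned int max_ipa_bits(unsigned int concat_tables)
{
    unsigned int bits = 30 + 9;   /* one level-1 table: 2^39 bytes */

    while ( concat_tables > 1 )
    {
        bits++;
        concat_tables >>= 1;
    }
    return bits;
}
```

Two concatenated pages (Xen's current P2M_FIRST_ORDER of 1) give 40 bits; the full 16 give 43, so anything beyond that needs a level-0 lookup.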

> our platform has peripherals placed above the 2^40 limit, which I could
> access with the 48-bit support patch series

OK. Thanks.

Ian.


* Re: [PATCH v2 1/3] xen/arm: Add 4-level page table for stage 2 translation
  2014-07-16 12:35       ` Ian Campbell
@ 2014-07-16 13:53         ` Vijay Kilari
  0 siblings, 0 replies; 11+ messages in thread
From: Vijay Kilari @ 2014-07-16 13:53 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	xen-devel@lists.xen.org, Stefano Stabellini

On Wed, Jul 16, 2014 at 6:05 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2014-07-16 at 17:12 +0530, Vijay Kilari wrote:
>> On Tue, Jul 15, 2014 at 7:17 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Tue, 2014-05-27 at 12:16 +0530, vijay.kilari@gmail.com wrote:
>> >> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>> >>
>> >> To support 48-bit Physical Address support, add 4-level
>> >> page tables for stage 2 translation.
>> >> With this patch stage 1 and stage 2 translation at EL2 are
>> >> with 4-levels
>> >
>> > I've been playing with this patch on various platforms and unfortunately
>> > there is a rather large snag here, AFAICT. Which is that 4-level stage 2
>> > pages are only available when ID_AA64MMFR0_EL1.PARange >= 44 bits, see
>> > table D4-5 in the ARMv8 ARM.
>>
>>  I read from "Table D5-18 Properties of the address lookup levels, 4KB
>> granule size"
>> Where the lookup level and granule size details are provided. For
>> above 39 bit PA
>> zero level is used.
>
> This is D4-19 in my copy -- OOI which version of the ARM are you working
> from? Mine is ARM DDI0487A.B.

   I am using the ARM DDI 0487A.a-1 version; I have now downloaded the .B version.

