qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH qemu 0/2] exec, spapr_pci: Advertise huge IOMMU pages
@ 2016-12-22  5:22 Alexey Kardashevskiy
  2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code Alexey Kardashevskiy
  2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available Alexey Kardashevskiy
  0 siblings, 2 replies; 13+ messages in thread
From: Alexey Kardashevskiy @ 2016-12-22  5:22 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alexey Kardashevskiy, qemu-ppc, Michael S . Tsirkin, David Gibson

These are to enable 16mb pages for IOMMU which should reduce
the amount of memory used for TCE tables in 256 times (compared
to 64k pages case). Please comment. Thanks!

Alexey Kardashevskiy (2):
  exec, kvm, target-ppc: Move getrampagesize() to common code
  spapr_pci: Advertise 16M IOMMU pages when available

 include/exec/ram_addr.h   |   1 +
 include/qemu/mmap-alloc.h |   2 +
 exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_pci.c        |   3 ++
 target-ppc/kvm.c          | 105 ++--------------------------------------------
 util/mmap-alloc.c         |  25 +++++++++++
 6 files changed, 116 insertions(+), 102 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2016-12-22  5:22 [Qemu-devel] [PATCH qemu 0/2] exec, spapr_pci: Advertise huge IOMMU pages Alexey Kardashevskiy
@ 2016-12-22  5:22 ` Alexey Kardashevskiy
  2017-01-02 23:34   ` David Gibson
  2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available Alexey Kardashevskiy
  1 sibling, 1 reply; 13+ messages in thread
From: Alexey Kardashevskiy @ 2016-12-22  5:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, qemu-ppc, David Gibson

getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.

However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.

This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.

This first user for it is going to be a spapr-pci-host-bridge which
needs to know the largest RAM page size so the guest could try
using bigger IOMMU pages to save memory.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/exec/ram_addr.h   |   1 +
 include/qemu/mmap-alloc.h |   2 +
 exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
 target-ppc/kvm.c          | 105 ++--------------------------------------------
 util/mmap-alloc.c         |  25 +++++++++++
 5 files changed, 113 insertions(+), 102 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 54d7108a9e..3935cbcfcd 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -91,6 +91,7 @@ typedef struct RAMList {
 } RAMList;
 extern RAMList ram_list;
 
+long qemu_getrampagesize(void);
 ram_addr_t last_ram_offset(void);
 void qemu_mutex_lock_ramlist(void);
 void qemu_mutex_unlock_ramlist(void);
diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index 933c024ac5..50385e3f81 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -5,6 +5,8 @@
 
 size_t qemu_fd_getpagesize(int fd);
 
+size_t qemu_mempath_getpagesize(const char *mem_path);
+
 void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
 
 void qemu_ram_munmap(void *ptr, size_t size);
diff --git a/exec.c b/exec.c
index 08c558eecf..d73b477a70 100644
--- a/exec.c
+++ b/exec.c
@@ -32,6 +32,7 @@
 #endif
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/numa.h"
 #include "qemu/timer.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
@@ -1218,6 +1219,87 @@ void qemu_mutex_unlock_ramlist(void)
 }
 
 #ifdef __linux__
+/*
+ * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
+ * may or may not name the same files / on the same filesystem now as
+ * when we actually open and map them.  Iterate over the file
+ * descriptors instead, and use qemu_fd_getpagesize().
+ */
+static int find_max_supported_pagesize(Object *obj, void *opaque)
+{
+    char *mem_path;
+    long *hpsize_min = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
+        mem_path = object_property_get_str(obj, "mem-path", NULL);
+        if (mem_path) {
+            long hpsize = qemu_mempath_getpagesize(mem_path);
+            if (hpsize < *hpsize_min) {
+                *hpsize_min = hpsize;
+            }
+        } else {
+            *hpsize_min = getpagesize();
+        }
+    }
+
+    return 0;
+}
+
+long qemu_getrampagesize(void)
+{
+    long hpsize = LONG_MAX;
+    long mainrampagesize;
+    Object *memdev_root;
+
+    if (mem_path) {
+        mainrampagesize = qemu_mempath_getpagesize(mem_path);
+    } else {
+        mainrampagesize = getpagesize();
+    }
+
+    /* it's possible we have memory-backend objects with
+     * hugepage-backed RAM. these may get mapped into system
+     * address space via -numa parameters or memory hotplug
+     * hooks. we want to take these into account, but we
+     * also want to make sure these supported hugepage
+     * sizes are applicable across the entire range of memory
+     * we may boot from, so we take the min across all
+     * backends, and assume normal pages in cases where a
+     * backend isn't backed by hugepages.
+     */
+    memdev_root = object_resolve_path("/objects", NULL);
+    if (memdev_root) {
+        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
+    }
+    if (hpsize == LONG_MAX) {
+        /* No additional memory regions found ==> Report main RAM page size */
+        return mainrampagesize;
+    }
+
+    /* If NUMA is disabled or the NUMA nodes are not backed with a
+     * memory-backend, then there is at least one node using "normal" RAM,
+     * so if its page size is smaller we have got to report that size instead.
+     */
+    if (hpsize > mainrampagesize &&
+        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
+        static bool warned;
+        if (!warned) {
+            error_report("Huge page support disabled (n/a for main memory).");
+            warned = true;
+        }
+        return mainrampagesize;
+    }
+
+    return hpsize;
+}
+#else
+long qemu_getrampagesize(void)
+{
+    return getpagesize();
+}
+#endif
+
+#ifdef __linux__
 static int64_t get_file_size(int fd)
 {
     int64_t size = lseek(fd, 0, SEEK_END);
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 6e91a4d8bb..e0abffa8ad 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -42,6 +42,7 @@
 #include "trace.h"
 #include "exec/gdbstub.h"
 #include "exec/memattrs.h"
+#include "exec/ram_addr.h"
 #include "sysemu/hostmem.h"
 #include "qemu/cutils.h"
 #if defined(TARGET_PPC64)
@@ -325,106 +326,6 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
     kvm_get_fallback_smmu_info(cpu, info);
 }
 
-static long gethugepagesize(const char *mem_path)
-{
-    struct statfs fs;
-    int ret;
-
-    do {
-        ret = statfs(mem_path, &fs);
-    } while (ret != 0 && errno == EINTR);
-
-    if (ret != 0) {
-        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
-                strerror(errno));
-        exit(1);
-    }
-
-#define HUGETLBFS_MAGIC       0x958458f6
-
-    if (fs.f_type != HUGETLBFS_MAGIC) {
-        /* Explicit mempath, but it's ordinary pages */
-        return getpagesize();
-    }
-
-    /* It's hugepage, return the huge page size */
-    return fs.f_bsize;
-}
-
-/*
- * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
- * may or may not name the same files / on the same filesystem now as
- * when we actually open and map them.  Iterate over the file
- * descriptors instead, and use qemu_fd_getpagesize().
- */
-static int find_max_supported_pagesize(Object *obj, void *opaque)
-{
-    char *mem_path;
-    long *hpsize_min = opaque;
-
-    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
-        mem_path = object_property_get_str(obj, "mem-path", NULL);
-        if (mem_path) {
-            long hpsize = gethugepagesize(mem_path);
-            if (hpsize < *hpsize_min) {
-                *hpsize_min = hpsize;
-            }
-        } else {
-            *hpsize_min = getpagesize();
-        }
-    }
-
-    return 0;
-}
-
-static long getrampagesize(void)
-{
-    long hpsize = LONG_MAX;
-    long mainrampagesize;
-    Object *memdev_root;
-
-    if (mem_path) {
-        mainrampagesize = gethugepagesize(mem_path);
-    } else {
-        mainrampagesize = getpagesize();
-    }
-
-    /* it's possible we have memory-backend objects with
-     * hugepage-backed RAM. these may get mapped into system
-     * address space via -numa parameters or memory hotplug
-     * hooks. we want to take these into account, but we
-     * also want to make sure these supported hugepage
-     * sizes are applicable across the entire range of memory
-     * we may boot from, so we take the min across all
-     * backends, and assume normal pages in cases where a
-     * backend isn't backed by hugepages.
-     */
-    memdev_root = object_resolve_path("/objects", NULL);
-    if (memdev_root) {
-        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
-    }
-    if (hpsize == LONG_MAX) {
-        /* No additional memory regions found ==> Report main RAM page size */
-        return mainrampagesize;
-    }
-
-    /* If NUMA is disabled or the NUMA nodes are not backed with a
-     * memory-backend, then there is at least one node using "normal" RAM,
-     * so if its page size is smaller we have got to report that size instead.
-     */
-    if (hpsize > mainrampagesize &&
-        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
-        static bool warned;
-        if (!warned) {
-            error_report("Huge page support disabled (n/a for main memory).");
-            warned = true;
-        }
-        return mainrampagesize;
-    }
-
-    return hpsize;
-}
-
 static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
 {
     if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
@@ -454,7 +355,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
         has_smmu_info = true;
     }
 
-    rampagesize = getrampagesize();
+    rampagesize = qemu_getrampagesize();
 
     /* Convert to QEMU form */
     memset(&env->sps, 0, sizeof(env->sps));
@@ -2177,7 +2078,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
     /* Find the largest hardware supported page size that's less than
      * or equal to the (logical) backing page size of guest RAM */
     kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
-    rampagesize = getrampagesize();
+    rampagesize = qemu_getrampagesize();
     best_page_shift = 0;
 
     for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 5a85aa3c89..564c79109c 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -39,6 +39,31 @@ size_t qemu_fd_getpagesize(int fd)
     return getpagesize();
 }
 
+size_t qemu_mempath_getpagesize(const char *mem_path)
+{
+#ifdef CONFIG_LINUX
+    struct statfs fs;
+    int ret;
+
+    do {
+        ret = statfs(mem_path, &fs);
+    } while (ret != 0 && errno == EINTR);
+
+    if (ret != 0) {
+        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
+                strerror(errno));
+        exit(1);
+    }
+
+    if (fs.f_type == HUGETLBFS_MAGIC) {
+        /* It's hugepage, return the huge page size */
+        return fs.f_bsize;
+    }
+#endif
+
+    return getpagesize();
+}
+
 void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 {
     /*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available
  2016-12-22  5:22 [Qemu-devel] [PATCH qemu 0/2] exec, spapr_pci: Advertise huge IOMMU pages Alexey Kardashevskiy
  2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code Alexey Kardashevskiy
@ 2016-12-22  5:22 ` Alexey Kardashevskiy
  2017-01-02 23:41   ` David Gibson
  1 sibling, 1 reply; 13+ messages in thread
From: Alexey Kardashevskiy @ 2016-12-22  5:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy, qemu-ppc, David Gibson

On sPAPR, IOMMU page size varies and if QEMU is running with RAM
backed with hugepages, we can advertise this to the guest so does
this patch.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 hw/ppc/spapr_pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index fd6fc1d953..09244056fc 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1505,6 +1505,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
     }
 
     /* DMA setup */
+    /* This allows huge pages for IOMMU when guest is backed with huge pages */
+    sphb->page_size_mask |= qemu_getrampagesize();
+
     for (i = 0; i < windows_supported; ++i) {
         tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn[i]);
         if (!tcet) {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code Alexey Kardashevskiy
@ 2017-01-02 23:34   ` David Gibson
  2017-02-09  5:43     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2017-01-02 23:34 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: qemu-devel, qemu-ppc, pbonzini

[-- Attachment #1: Type: text/plain, Size: 11484 bytes --]

On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
> getrampagesize() returns the largest supported page size and mainly
> used to know if huge pages are enabled.
> 
> However is implemented in target-ppc/kvm.c and not available
> in TCG or other architectures.
> 
> This renames and moves gethugepagesize() to mmap-alloc.c where
> fd-based analog of it is already implemented. This renames and moves
> getrampagesize() to exec.c as it seems to be the common place for
> helpers like this.
> 
> This first user for it is going to be a spapr-pci-host-bridge which
> needs to know the largest RAM page size so the guest could try
> using bigger IOMMU pages to save memory.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Seems sensible to me, but I'm not comfortable merging this via my tree
since it touches such core code.  Probably should go via Paolo.

> ---
>  include/exec/ram_addr.h   |   1 +
>  include/qemu/mmap-alloc.h |   2 +
>  exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
>  target-ppc/kvm.c          | 105 ++--------------------------------------------
>  util/mmap-alloc.c         |  25 +++++++++++
>  5 files changed, 113 insertions(+), 102 deletions(-)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 54d7108a9e..3935cbcfcd 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -91,6 +91,7 @@ typedef struct RAMList {
>  } RAMList;
>  extern RAMList ram_list;
>  
> +long qemu_getrampagesize(void);
>  ram_addr_t last_ram_offset(void);
>  void qemu_mutex_lock_ramlist(void);
>  void qemu_mutex_unlock_ramlist(void);
> diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
> index 933c024ac5..50385e3f81 100644
> --- a/include/qemu/mmap-alloc.h
> +++ b/include/qemu/mmap-alloc.h
> @@ -5,6 +5,8 @@
>  
>  size_t qemu_fd_getpagesize(int fd);
>  
> +size_t qemu_mempath_getpagesize(const char *mem_path);
> +
>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
>  
>  void qemu_ram_munmap(void *ptr, size_t size);
> diff --git a/exec.c b/exec.c
> index 08c558eecf..d73b477a70 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -32,6 +32,7 @@
>  #endif
>  #include "sysemu/kvm.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/numa.h"
>  #include "qemu/timer.h"
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
> @@ -1218,6 +1219,87 @@ void qemu_mutex_unlock_ramlist(void)
>  }
>  
>  #ifdef __linux__
> +/*
> + * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
> + * may or may not name the same files / on the same filesystem now as
> + * when we actually open and map them.  Iterate over the file
> + * descriptors instead, and use qemu_fd_getpagesize().
> + */
> +static int find_max_supported_pagesize(Object *obj, void *opaque)
> +{
> +    char *mem_path;
> +    long *hpsize_min = opaque;
> +
> +    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
> +        mem_path = object_property_get_str(obj, "mem-path", NULL);
> +        if (mem_path) {
> +            long hpsize = qemu_mempath_getpagesize(mem_path);
> +            if (hpsize < *hpsize_min) {
> +                *hpsize_min = hpsize;
> +            }
> +        } else {
> +            *hpsize_min = getpagesize();
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +long qemu_getrampagesize(void)
> +{
> +    long hpsize = LONG_MAX;
> +    long mainrampagesize;
> +    Object *memdev_root;
> +
> +    if (mem_path) {
> +        mainrampagesize = qemu_mempath_getpagesize(mem_path);
> +    } else {
> +        mainrampagesize = getpagesize();
> +    }
> +
> +    /* it's possible we have memory-backend objects with
> +     * hugepage-backed RAM. these may get mapped into system
> +     * address space via -numa parameters or memory hotplug
> +     * hooks. we want to take these into account, but we
> +     * also want to make sure these supported hugepage
> +     * sizes are applicable across the entire range of memory
> +     * we may boot from, so we take the min across all
> +     * backends, and assume normal pages in cases where a
> +     * backend isn't backed by hugepages.
> +     */
> +    memdev_root = object_resolve_path("/objects", NULL);
> +    if (memdev_root) {
> +        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
> +    }
> +    if (hpsize == LONG_MAX) {
> +        /* No additional memory regions found ==> Report main RAM page size */
> +        return mainrampagesize;
> +    }
> +
> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
> +     * memory-backend, then there is at least one node using "normal" RAM,
> +     * so if its page size is smaller we have got to report that size instead.
> +     */
> +    if (hpsize > mainrampagesize &&
> +        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
> +        static bool warned;
> +        if (!warned) {
> +            error_report("Huge page support disabled (n/a for main memory).");
> +            warned = true;
> +        }
> +        return mainrampagesize;
> +    }
> +
> +    return hpsize;
> +}
> +#else
> +long qemu_getrampagesize(void)
> +{
> +    return getpagesize();
> +}
> +#endif
> +
> +#ifdef __linux__
>  static int64_t get_file_size(int fd)
>  {
>      int64_t size = lseek(fd, 0, SEEK_END);
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 6e91a4d8bb..e0abffa8ad 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -42,6 +42,7 @@
>  #include "trace.h"
>  #include "exec/gdbstub.h"
>  #include "exec/memattrs.h"
> +#include "exec/ram_addr.h"
>  #include "sysemu/hostmem.h"
>  #include "qemu/cutils.h"
>  #if defined(TARGET_PPC64)
> @@ -325,106 +326,6 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
>      kvm_get_fallback_smmu_info(cpu, info);
>  }
>  
> -static long gethugepagesize(const char *mem_path)
> -{
> -    struct statfs fs;
> -    int ret;
> -
> -    do {
> -        ret = statfs(mem_path, &fs);
> -    } while (ret != 0 && errno == EINTR);
> -
> -    if (ret != 0) {
> -        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
> -                strerror(errno));
> -        exit(1);
> -    }
> -
> -#define HUGETLBFS_MAGIC       0x958458f6
> -
> -    if (fs.f_type != HUGETLBFS_MAGIC) {
> -        /* Explicit mempath, but it's ordinary pages */
> -        return getpagesize();
> -    }
> -
> -    /* It's hugepage, return the huge page size */
> -    return fs.f_bsize;
> -}
> -
> -/*
> - * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
> - * may or may not name the same files / on the same filesystem now as
> - * when we actually open and map them.  Iterate over the file
> - * descriptors instead, and use qemu_fd_getpagesize().
> - */
> -static int find_max_supported_pagesize(Object *obj, void *opaque)
> -{
> -    char *mem_path;
> -    long *hpsize_min = opaque;
> -
> -    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
> -        mem_path = object_property_get_str(obj, "mem-path", NULL);
> -        if (mem_path) {
> -            long hpsize = gethugepagesize(mem_path);
> -            if (hpsize < *hpsize_min) {
> -                *hpsize_min = hpsize;
> -            }
> -        } else {
> -            *hpsize_min = getpagesize();
> -        }
> -    }
> -
> -    return 0;
> -}
> -
> -static long getrampagesize(void)
> -{
> -    long hpsize = LONG_MAX;
> -    long mainrampagesize;
> -    Object *memdev_root;
> -
> -    if (mem_path) {
> -        mainrampagesize = gethugepagesize(mem_path);
> -    } else {
> -        mainrampagesize = getpagesize();
> -    }
> -
> -    /* it's possible we have memory-backend objects with
> -     * hugepage-backed RAM. these may get mapped into system
> -     * address space via -numa parameters or memory hotplug
> -     * hooks. we want to take these into account, but we
> -     * also want to make sure these supported hugepage
> -     * sizes are applicable across the entire range of memory
> -     * we may boot from, so we take the min across all
> -     * backends, and assume normal pages in cases where a
> -     * backend isn't backed by hugepages.
> -     */
> -    memdev_root = object_resolve_path("/objects", NULL);
> -    if (memdev_root) {
> -        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
> -    }
> -    if (hpsize == LONG_MAX) {
> -        /* No additional memory regions found ==> Report main RAM page size */
> -        return mainrampagesize;
> -    }
> -
> -    /* If NUMA is disabled or the NUMA nodes are not backed with a
> -     * memory-backend, then there is at least one node using "normal" RAM,
> -     * so if its page size is smaller we have got to report that size instead.
> -     */
> -    if (hpsize > mainrampagesize &&
> -        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
> -        static bool warned;
> -        if (!warned) {
> -            error_report("Huge page support disabled (n/a for main memory).");
> -            warned = true;
> -        }
> -        return mainrampagesize;
> -    }
> -
> -    return hpsize;
> -}
> -
>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>  {
>      if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
> @@ -454,7 +355,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>          has_smmu_info = true;
>      }
>  
> -    rampagesize = getrampagesize();
> +    rampagesize = qemu_getrampagesize();
>  
>      /* Convert to QEMU form */
>      memset(&env->sps, 0, sizeof(env->sps));
> @@ -2177,7 +2078,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>      /* Find the largest hardware supported page size that's less than
>       * or equal to the (logical) backing page size of guest RAM */
>      kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
> -    rampagesize = getrampagesize();
> +    rampagesize = qemu_getrampagesize();
>      best_page_shift = 0;
>  
>      for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> index 5a85aa3c89..564c79109c 100644
> --- a/util/mmap-alloc.c
> +++ b/util/mmap-alloc.c
> @@ -39,6 +39,31 @@ size_t qemu_fd_getpagesize(int fd)
>      return getpagesize();
>  }
>  
> +size_t qemu_mempath_getpagesize(const char *mem_path)
> +{
> +#ifdef CONFIG_LINUX
> +    struct statfs fs;
> +    int ret;
> +
> +    do {
> +        ret = statfs(mem_path, &fs);
> +    } while (ret != 0 && errno == EINTR);
> +
> +    if (ret != 0) {
> +        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
> +                strerror(errno));
> +        exit(1);
> +    }
> +
> +    if (fs.f_type == HUGETLBFS_MAGIC) {
> +        /* It's hugepage, return the huge page size */
> +        return fs.f_bsize;
> +    }
> +#endif
> +
> +    return getpagesize();
> +}
> +
>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
>  {
>      /*

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available
  2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available Alexey Kardashevskiy
@ 2017-01-02 23:41   ` David Gibson
  2017-01-09  2:06     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2017-01-02 23:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 1833 bytes --]

On Thu, Dec 22, 2016 at 04:22:12PM +1100, Alexey Kardashevskiy wrote:
> On sPAPR, IOMMU page size varies and if QEMU is running with RAM
> backed with hugepages, we can advertise this to the guest so does
> this patch.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  hw/ppc/spapr_pci.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index fd6fc1d953..09244056fc 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1505,6 +1505,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>      }
>  
>      /* DMA setup */
> +    /* This allows huge pages for IOMMU when guest is backed with huge pages */
> +    sphb->page_size_mask |= qemu_getrampagesize();

This doesn't look right - you're unconditionally enabling the host ram
page size, regardless of anything else.  Instead the backing page size
should be used to filter out those sizes which are possible from the
list of those supported by the guest hardware.  This patch will give
particularly odd results if you ran it on x86 with hugepages for
example: it would advertise a 2M IOMMU page size, which could never
exist on native POWER.

Except... come to think of it, why is the backing RAM page size
relevant at all?  Or rather.. I think VFIO should be able to cope with
any guest IOMMU page size which is larger than the host ram page size
(although if it's much larger it could get expensive in the host
tables).  This case would already be routine for ppc64 on x86, where
the guest IOMMU page size is 64kiB, but the host page size is 4 kiB.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available
  2017-01-02 23:41   ` David Gibson
@ 2017-01-09  2:06     ` Alexey Kardashevskiy
  2017-01-12  5:09       ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Alexey Kardashevskiy @ 2017-01-09  2:06 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]

On 03/01/17 10:41, David Gibson wrote:
> On Thu, Dec 22, 2016 at 04:22:12PM +1100, Alexey Kardashevskiy wrote:
>> On sPAPR, IOMMU page size varies and if QEMU is running with RAM
>> backed with hugepages, we can advertise this to the guest so does
>> this patch.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  hw/ppc/spapr_pci.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index fd6fc1d953..09244056fc 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -1505,6 +1505,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>      }
>>  
>>      /* DMA setup */
>> +    /* This allows huge pages for IOMMU when guest is backed with huge pages */
>> +    sphb->page_size_mask |= qemu_getrampagesize();
> 
> This doesn't look right - you're unconditionally enabling the host ram
> page size, regardless of anything else.  Instead the backing page size
> should be used to filter out those sizes which are possible from the
> list of those supported by the guest hardware.  This patch will give
> particularly odd results if you ran it on x86 with hugepages for
> example: it would advertise a 2M IOMMU page size, which could never
> exist on native POWER.

Ok, I'll filter 16M out if passed to PHB and not supported by the host.


> Except... come to think of it, why is the backing RAM page size
> relevant at all? 

Because this is just an optimization/acceleration and I'd think the user
wants to know if it is actually accelerated or not. If I always allow 16M
pages, and QEMU is not backed with hugepages, then all H_PUT_TCE will go
via slow path and consume as much memory for TCE as without hugepages, and
it will only be visible to the user if TCE-tracepoints are enabled.

> Or rather.. I think VFIO should be able to cope with
> any guest IOMMU page size which is larger than the host ram page size

It could, I just do not see much benefit in it. pseries guest can negotiate
4k, 64k, 16m pages and this seems to cover everything we want, why would we
want to emulate IOMMU page size?

> (although if it's much larger it could get expensive in the host
> tables).  This case would already be routine for ppc64 on x86, where
> the guest IOMMU page size is 64kiB, but the host page size is 4 kiB.




-- 
Alexey


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 839 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available
  2017-01-09  2:06     ` Alexey Kardashevskiy
@ 2017-01-12  5:09       ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2017-01-12  5:09 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2830 bytes --]

On Mon, Jan 09, 2017 at 01:06:03PM +1100, Alexey Kardashevskiy wrote:
> On 03/01/17 10:41, David Gibson wrote:
> > On Thu, Dec 22, 2016 at 04:22:12PM +1100, Alexey Kardashevskiy wrote:
> >> On sPAPR, IOMMU page size varies and if QEMU is running with RAM
> >> backed with hugepages, we can advertise this to the guest so does
> >> this patch.
> >>
> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >> ---
> >>  hw/ppc/spapr_pci.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >> index fd6fc1d953..09244056fc 100644
> >> --- a/hw/ppc/spapr_pci.c
> >> +++ b/hw/ppc/spapr_pci.c
> >> @@ -1505,6 +1505,9 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
> >>      }
> >>  
> >>      /* DMA setup */
> >> +    /* This allows huge pages for IOMMU when guest is backed with huge pages */
> >> +    sphb->page_size_mask |= qemu_getrampagesize();
> > 
> > This doesn't look right - you're unconditionally enabling the host ram
> > page size, regardless of anything else.  Instead the backing page size
> > should be used to filter out those sizes which are possible from the
> > list of those supported by the guest hardware.  This patch will give
> > particularly odd results if you ran it on x86 with hugepages for
> > example: it would advertise a 2M IOMMU page size, which could never
> > exist on native POWER.
> 
> Ok, I'll filter 16M out if passed to PHB and not supported by the host.
> 
> 
> > Except... come to think of it, why is the backing RAM page size
> > relevant at all? 
> 
> Because this is just an optimization/acceleration and I'd think the user
> wants to know if it is actually accelerated or not. If I always allow 16M
> pages, and QEMU is not backed with hugepages, then all H_PUT_TCE will go
> via slow path and consume as much memory for TCE as without hugepages, and
> it will only be visible to the user if TCE-tracepoints are enabled.

Hm, ok, fair enough.

> > Or rather.. I think VFIO should be able to cope with
> > any guest IOMMU page size which is larger than the host ram page size
> 
> It could, I just do not see much benefit in it. pseries guest can negotiate
> 4k, 64k, 16m pages and this seems to cover everything we want, why would we
> want to emulate IOMMU page size?

Just for testing or debugging, I suppose.

> 
> > (although if it's much larger it could get expensive in the host
> > tables).  This case would already be routine for ppc64 on x86, where
> > the guest IOMMU page size is 64kiB, but the host page size is 4 kiB.
> 
> 
> 
> 




-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2017-01-02 23:34   ` David Gibson
@ 2017-02-09  5:43     ` Alexey Kardashevskiy
  2017-02-09 11:48       ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Alexey Kardashevskiy @ 2017-02-09  5:43 UTC (permalink / raw)
  To: pbonzini; +Cc: David Gibson, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 11706 bytes --]

On 03/01/17 10:34, David Gibson wrote:
> On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
>> getrampagesize() returns the largest supported page size and mainly
>> used to know if huge pages are enabled.
>>
>> However is implemented in target-ppc/kvm.c and not available
>> in TCG or other architectures.
>>
>> This renames and moves gethugepagesize() to mmap-alloc.c where
>> fd-based analog of it is already implemented. This renames and moves
>> getrampagesize() to exec.c as it seems to be the common place for
>> helpers like this.
>>
>> This first user for it is going to be a spapr-pci-host-bridge which
>> needs to know the largest RAM page size so the guest could try
>> using bigger IOMMU pages to save memory.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> Seems sensible to me, but I'm not comfortable merging this via my tree
> since it touches such core code.  Probably should go via Paolo.


Paolo, ping?



> 
>> ---
>>  include/exec/ram_addr.h   |   1 +
>>  include/qemu/mmap-alloc.h |   2 +
>>  exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
>>  target-ppc/kvm.c          | 105 ++--------------------------------------------
>>  util/mmap-alloc.c         |  25 +++++++++++
>>  5 files changed, 113 insertions(+), 102 deletions(-)
>>
>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>> index 54d7108a9e..3935cbcfcd 100644
>> --- a/include/exec/ram_addr.h
>> +++ b/include/exec/ram_addr.h
>> @@ -91,6 +91,7 @@ typedef struct RAMList {
>>  } RAMList;
>>  extern RAMList ram_list;
>>  
>> +long qemu_getrampagesize(void);
>>  ram_addr_t last_ram_offset(void);
>>  void qemu_mutex_lock_ramlist(void);
>>  void qemu_mutex_unlock_ramlist(void);
>> diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
>> index 933c024ac5..50385e3f81 100644
>> --- a/include/qemu/mmap-alloc.h
>> +++ b/include/qemu/mmap-alloc.h
>> @@ -5,6 +5,8 @@
>>  
>>  size_t qemu_fd_getpagesize(int fd);
>>  
>> +size_t qemu_mempath_getpagesize(const char *mem_path);
>> +
>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
>>  
>>  void qemu_ram_munmap(void *ptr, size_t size);
>> diff --git a/exec.c b/exec.c
>> index 08c558eecf..d73b477a70 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -32,6 +32,7 @@
>>  #endif
>>  #include "sysemu/kvm.h"
>>  #include "sysemu/sysemu.h"
>> +#include "sysemu/numa.h"
>>  #include "qemu/timer.h"
>>  #include "qemu/config-file.h"
>>  #include "qemu/error-report.h"
>> @@ -1218,6 +1219,87 @@ void qemu_mutex_unlock_ramlist(void)
>>  }
>>  
>>  #ifdef __linux__
>> +/*
>> + * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
>> + * may or may not name the same files / on the same filesystem now as
>> + * when we actually open and map them.  Iterate over the file
>> + * descriptors instead, and use qemu_fd_getpagesize().
>> + */
>> +static int find_max_supported_pagesize(Object *obj, void *opaque)
>> +{
>> +    char *mem_path;
>> +    long *hpsize_min = opaque;
>> +
>> +    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
>> +        mem_path = object_property_get_str(obj, "mem-path", NULL);
>> +        if (mem_path) {
>> +            long hpsize = qemu_mempath_getpagesize(mem_path);
>> +            if (hpsize < *hpsize_min) {
>> +                *hpsize_min = hpsize;
>> +            }
>> +        } else {
>> +            *hpsize_min = getpagesize();
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +long qemu_getrampagesize(void)
>> +{
>> +    long hpsize = LONG_MAX;
>> +    long mainrampagesize;
>> +    Object *memdev_root;
>> +
>> +    if (mem_path) {
>> +        mainrampagesize = qemu_mempath_getpagesize(mem_path);
>> +    } else {
>> +        mainrampagesize = getpagesize();
>> +    }
>> +
>> +    /* it's possible we have memory-backend objects with
>> +     * hugepage-backed RAM. these may get mapped into system
>> +     * address space via -numa parameters or memory hotplug
>> +     * hooks. we want to take these into account, but we
>> +     * also want to make sure these supported hugepage
>> +     * sizes are applicable across the entire range of memory
>> +     * we may boot from, so we take the min across all
>> +     * backends, and assume normal pages in cases where a
>> +     * backend isn't backed by hugepages.
>> +     */
>> +    memdev_root = object_resolve_path("/objects", NULL);
>> +    if (memdev_root) {
>> +        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>> +    }
>> +    if (hpsize == LONG_MAX) {
>> +        /* No additional memory regions found ==> Report main RAM page size */
>> +        return mainrampagesize;
>> +    }
>> +
>> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
>> +     * memory-backend, then there is at least one node using "normal" RAM,
>> +     * so if its page size is smaller we have got to report that size instead.
>> +     */
>> +    if (hpsize > mainrampagesize &&
>> +        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
>> +        static bool warned;
>> +        if (!warned) {
>> +            error_report("Huge page support disabled (n/a for main memory).");
>> +            warned = true;
>> +        }
>> +        return mainrampagesize;
>> +    }
>> +
>> +    return hpsize;
>> +}
>> +#else
>> +long qemu_getrampagesize(void)
>> +{
>> +    return getpagesize();
>> +}
>> +#endif
>> +
>> +#ifdef __linux__
>>  static int64_t get_file_size(int fd)
>>  {
>>      int64_t size = lseek(fd, 0, SEEK_END);
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 6e91a4d8bb..e0abffa8ad 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -42,6 +42,7 @@
>>  #include "trace.h"
>>  #include "exec/gdbstub.h"
>>  #include "exec/memattrs.h"
>> +#include "exec/ram_addr.h"
>>  #include "sysemu/hostmem.h"
>>  #include "qemu/cutils.h"
>>  #if defined(TARGET_PPC64)
>> @@ -325,106 +326,6 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
>>      kvm_get_fallback_smmu_info(cpu, info);
>>  }
>>  
>> -static long gethugepagesize(const char *mem_path)
>> -{
>> -    struct statfs fs;
>> -    int ret;
>> -
>> -    do {
>> -        ret = statfs(mem_path, &fs);
>> -    } while (ret != 0 && errno == EINTR);
>> -
>> -    if (ret != 0) {
>> -        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
>> -                strerror(errno));
>> -        exit(1);
>> -    }
>> -
>> -#define HUGETLBFS_MAGIC       0x958458f6
>> -
>> -    if (fs.f_type != HUGETLBFS_MAGIC) {
>> -        /* Explicit mempath, but it's ordinary pages */
>> -        return getpagesize();
>> -    }
>> -
>> -    /* It's hugepage, return the huge page size */
>> -    return fs.f_bsize;
>> -}
>> -
>> -/*
>> - * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
>> - * may or may not name the same files / on the same filesystem now as
>> - * when we actually open and map them.  Iterate over the file
>> - * descriptors instead, and use qemu_fd_getpagesize().
>> - */
>> -static int find_max_supported_pagesize(Object *obj, void *opaque)
>> -{
>> -    char *mem_path;
>> -    long *hpsize_min = opaque;
>> -
>> -    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
>> -        mem_path = object_property_get_str(obj, "mem-path", NULL);
>> -        if (mem_path) {
>> -            long hpsize = gethugepagesize(mem_path);
>> -            if (hpsize < *hpsize_min) {
>> -                *hpsize_min = hpsize;
>> -            }
>> -        } else {
>> -            *hpsize_min = getpagesize();
>> -        }
>> -    }
>> -
>> -    return 0;
>> -}
>> -
>> -static long getrampagesize(void)
>> -{
>> -    long hpsize = LONG_MAX;
>> -    long mainrampagesize;
>> -    Object *memdev_root;
>> -
>> -    if (mem_path) {
>> -        mainrampagesize = gethugepagesize(mem_path);
>> -    } else {
>> -        mainrampagesize = getpagesize();
>> -    }
>> -
>> -    /* it's possible we have memory-backend objects with
>> -     * hugepage-backed RAM. these may get mapped into system
>> -     * address space via -numa parameters or memory hotplug
>> -     * hooks. we want to take these into account, but we
>> -     * also want to make sure these supported hugepage
>> -     * sizes are applicable across the entire range of memory
>> -     * we may boot from, so we take the min across all
>> -     * backends, and assume normal pages in cases where a
>> -     * backend isn't backed by hugepages.
>> -     */
>> -    memdev_root = object_resolve_path("/objects", NULL);
>> -    if (memdev_root) {
>> -        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>> -    }
>> -    if (hpsize == LONG_MAX) {
>> -        /* No additional memory regions found ==> Report main RAM page size */
>> -        return mainrampagesize;
>> -    }
>> -
>> -    /* If NUMA is disabled or the NUMA nodes are not backed with a
>> -     * memory-backend, then there is at least one node using "normal" RAM,
>> -     * so if its page size is smaller we have got to report that size instead.
>> -     */
>> -    if (hpsize > mainrampagesize &&
>> -        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
>> -        static bool warned;
>> -        if (!warned) {
>> -            error_report("Huge page support disabled (n/a for main memory).");
>> -            warned = true;
>> -        }
>> -        return mainrampagesize;
>> -    }
>> -
>> -    return hpsize;
>> -}
>> -
>>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>>  {
>>      if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
>> @@ -454,7 +355,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>>          has_smmu_info = true;
>>      }
>>  
>> -    rampagesize = getrampagesize();
>> +    rampagesize = qemu_getrampagesize();
>>  
>>      /* Convert to QEMU form */
>>      memset(&env->sps, 0, sizeof(env->sps));
>> @@ -2177,7 +2078,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>>      /* Find the largest hardware supported page size that's less than
>>       * or equal to the (logical) backing page size of guest RAM */
>>      kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
>> -    rampagesize = getrampagesize();
>> +    rampagesize = qemu_getrampagesize();
>>      best_page_shift = 0;
>>  
>>      for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
>> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
>> index 5a85aa3c89..564c79109c 100644
>> --- a/util/mmap-alloc.c
>> +++ b/util/mmap-alloc.c
>> @@ -39,6 +39,31 @@ size_t qemu_fd_getpagesize(int fd)
>>      return getpagesize();
>>  }
>>  
>> +size_t qemu_mempath_getpagesize(const char *mem_path)
>> +{
>> +#ifdef CONFIG_LINUX
>> +    struct statfs fs;
>> +    int ret;
>> +
>> +    do {
>> +        ret = statfs(mem_path, &fs);
>> +    } while (ret != 0 && errno == EINTR);
>> +
>> +    if (ret != 0) {
>> +        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
>> +                strerror(errno));
>> +        exit(1);
>> +    }
>> +
>> +    if (fs.f_type == HUGETLBFS_MAGIC) {
>> +        /* It's hugepage, return the huge page size */
>> +        return fs.f_bsize;
>> +    }
>> +#endif
>> +
>> +    return getpagesize();
>> +}
>> +
>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
>>  {
>>      /*
> 


-- 
Alexey


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 839 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2017-02-09  5:43     ` Alexey Kardashevskiy
@ 2017-02-09 11:48       ` Paolo Bonzini
  2017-02-10  0:41         ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2017-02-09 11:48 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: David Gibson, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 12132 bytes --]



On 09/02/2017 06:43, Alexey Kardashevskiy wrote:
> On 03/01/17 10:34, David Gibson wrote:
>> On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
>>> getrampagesize() returns the largest supported page size and mainly
>>> used to know if huge pages are enabled.
>>>
>>> However is implemented in target-ppc/kvm.c and not available
>>> in TCG or other architectures.
>>>
>>> This renames and moves gethugepagesize() to mmap-alloc.c where
>>> fd-based analog of it is already implemented. This renames and moves
>>> getrampagesize() to exec.c as it seems to be the common place for
>>> helpers like this.
>>>
>>> This first user for it is going to be a spapr-pci-host-bridge which
>>> needs to know the largest RAM page size so the guest could try
>>> using bigger IOMMU pages to save memory.
>>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>
>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>>
>> Seems sensible to me, but I'm not comfortable merging this via my tree
>> since it touches such core code.  Probably should go via Paolo.
> 
> Paolo, ping?

It's just code movement, go ahead.

Paolo

> 
> 
>>
>>> ---
>>>  include/exec/ram_addr.h   |   1 +
>>>  include/qemu/mmap-alloc.h |   2 +
>>>  exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
>>>  target-ppc/kvm.c          | 105 ++--------------------------------------------
>>>  util/mmap-alloc.c         |  25 +++++++++++
>>>  5 files changed, 113 insertions(+), 102 deletions(-)
>>>
>>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>>> index 54d7108a9e..3935cbcfcd 100644
>>> --- a/include/exec/ram_addr.h
>>> +++ b/include/exec/ram_addr.h
>>> @@ -91,6 +91,7 @@ typedef struct RAMList {
>>>  } RAMList;
>>>  extern RAMList ram_list;
>>>  
>>> +long qemu_getrampagesize(void);
>>>  ram_addr_t last_ram_offset(void);
>>>  void qemu_mutex_lock_ramlist(void);
>>>  void qemu_mutex_unlock_ramlist(void);
>>> diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
>>> index 933c024ac5..50385e3f81 100644
>>> --- a/include/qemu/mmap-alloc.h
>>> +++ b/include/qemu/mmap-alloc.h
>>> @@ -5,6 +5,8 @@
>>>  
>>>  size_t qemu_fd_getpagesize(int fd);
>>>  
>>> +size_t qemu_mempath_getpagesize(const char *mem_path);
>>> +
>>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
>>>  
>>>  void qemu_ram_munmap(void *ptr, size_t size);
>>> diff --git a/exec.c b/exec.c
>>> index 08c558eecf..d73b477a70 100644
>>> --- a/exec.c
>>> +++ b/exec.c
>>> @@ -32,6 +32,7 @@
>>>  #endif
>>>  #include "sysemu/kvm.h"
>>>  #include "sysemu/sysemu.h"
>>> +#include "sysemu/numa.h"
>>>  #include "qemu/timer.h"
>>>  #include "qemu/config-file.h"
>>>  #include "qemu/error-report.h"
>>> @@ -1218,6 +1219,87 @@ void qemu_mutex_unlock_ramlist(void)
>>>  }
>>>  
>>>  #ifdef __linux__
>>> +/*
>>> + * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
>>> + * may or may not name the same files / on the same filesystem now as
>>> + * when we actually open and map them.  Iterate over the file
>>> + * descriptors instead, and use qemu_fd_getpagesize().
>>> + */
>>> +static int find_max_supported_pagesize(Object *obj, void *opaque)
>>> +{
>>> +    char *mem_path;
>>> +    long *hpsize_min = opaque;
>>> +
>>> +    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
>>> +        mem_path = object_property_get_str(obj, "mem-path", NULL);
>>> +        if (mem_path) {
>>> +            long hpsize = qemu_mempath_getpagesize(mem_path);
>>> +            if (hpsize < *hpsize_min) {
>>> +                *hpsize_min = hpsize;
>>> +            }
>>> +        } else {
>>> +            *hpsize_min = getpagesize();
>>> +        }
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +long qemu_getrampagesize(void)
>>> +{
>>> +    long hpsize = LONG_MAX;
>>> +    long mainrampagesize;
>>> +    Object *memdev_root;
>>> +
>>> +    if (mem_path) {
>>> +        mainrampagesize = qemu_mempath_getpagesize(mem_path);
>>> +    } else {
>>> +        mainrampagesize = getpagesize();
>>> +    }
>>> +
>>> +    /* it's possible we have memory-backend objects with
>>> +     * hugepage-backed RAM. these may get mapped into system
>>> +     * address space via -numa parameters or memory hotplug
>>> +     * hooks. we want to take these into account, but we
>>> +     * also want to make sure these supported hugepage
>>> +     * sizes are applicable across the entire range of memory
>>> +     * we may boot from, so we take the min across all
>>> +     * backends, and assume normal pages in cases where a
>>> +     * backend isn't backed by hugepages.
>>> +     */
>>> +    memdev_root = object_resolve_path("/objects", NULL);
>>> +    if (memdev_root) {
>>> +        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>>> +    }
>>> +    if (hpsize == LONG_MAX) {
>>> +        /* No additional memory regions found ==> Report main RAM page size */
>>> +        return mainrampagesize;
>>> +    }
>>> +
>>> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
>>> +     * memory-backend, then there is at least one node using "normal" RAM,
>>> +     * so if its page size is smaller we have got to report that size instead.
>>> +     */
>>> +    if (hpsize > mainrampagesize &&
>>> +        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
>>> +        static bool warned;
>>> +        if (!warned) {
>>> +            error_report("Huge page support disabled (n/a for main memory).");
>>> +            warned = true;
>>> +        }
>>> +        return mainrampagesize;
>>> +    }
>>> +
>>> +    return hpsize;
>>> +}
>>> +#else
>>> +long qemu_getrampagesize(void)
>>> +{
>>> +    return getpagesize();
>>> +}
>>> +#endif
>>> +
>>> +#ifdef __linux__
>>>  static int64_t get_file_size(int fd)
>>>  {
>>>      int64_t size = lseek(fd, 0, SEEK_END);
>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>> index 6e91a4d8bb..e0abffa8ad 100644
>>> --- a/target-ppc/kvm.c
>>> +++ b/target-ppc/kvm.c
>>> @@ -42,6 +42,7 @@
>>>  #include "trace.h"
>>>  #include "exec/gdbstub.h"
>>>  #include "exec/memattrs.h"
>>> +#include "exec/ram_addr.h"
>>>  #include "sysemu/hostmem.h"
>>>  #include "qemu/cutils.h"
>>>  #if defined(TARGET_PPC64)
>>> @@ -325,106 +326,6 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
>>>      kvm_get_fallback_smmu_info(cpu, info);
>>>  }
>>>  
>>> -static long gethugepagesize(const char *mem_path)
>>> -{
>>> -    struct statfs fs;
>>> -    int ret;
>>> -
>>> -    do {
>>> -        ret = statfs(mem_path, &fs);
>>> -    } while (ret != 0 && errno == EINTR);
>>> -
>>> -    if (ret != 0) {
>>> -        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
>>> -                strerror(errno));
>>> -        exit(1);
>>> -    }
>>> -
>>> -#define HUGETLBFS_MAGIC       0x958458f6
>>> -
>>> -    if (fs.f_type != HUGETLBFS_MAGIC) {
>>> -        /* Explicit mempath, but it's ordinary pages */
>>> -        return getpagesize();
>>> -    }
>>> -
>>> -    /* It's hugepage, return the huge page size */
>>> -    return fs.f_bsize;
>>> -}
>>> -
>>> -/*
>>> - * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
>>> - * may or may not name the same files / on the same filesystem now as
>>> - * when we actually open and map them.  Iterate over the file
>>> - * descriptors instead, and use qemu_fd_getpagesize().
>>> - */
>>> -static int find_max_supported_pagesize(Object *obj, void *opaque)
>>> -{
>>> -    char *mem_path;
>>> -    long *hpsize_min = opaque;
>>> -
>>> -    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
>>> -        mem_path = object_property_get_str(obj, "mem-path", NULL);
>>> -        if (mem_path) {
>>> -            long hpsize = gethugepagesize(mem_path);
>>> -            if (hpsize < *hpsize_min) {
>>> -                *hpsize_min = hpsize;
>>> -            }
>>> -        } else {
>>> -            *hpsize_min = getpagesize();
>>> -        }
>>> -    }
>>> -
>>> -    return 0;
>>> -}
>>> -
>>> -static long getrampagesize(void)
>>> -{
>>> -    long hpsize = LONG_MAX;
>>> -    long mainrampagesize;
>>> -    Object *memdev_root;
>>> -
>>> -    if (mem_path) {
>>> -        mainrampagesize = gethugepagesize(mem_path);
>>> -    } else {
>>> -        mainrampagesize = getpagesize();
>>> -    }
>>> -
>>> -    /* it's possible we have memory-backend objects with
>>> -     * hugepage-backed RAM. these may get mapped into system
>>> -     * address space via -numa parameters or memory hotplug
>>> -     * hooks. we want to take these into account, but we
>>> -     * also want to make sure these supported hugepage
>>> -     * sizes are applicable across the entire range of memory
>>> -     * we may boot from, so we take the min across all
>>> -     * backends, and assume normal pages in cases where a
>>> -     * backend isn't backed by hugepages.
>>> -     */
>>> -    memdev_root = object_resolve_path("/objects", NULL);
>>> -    if (memdev_root) {
>>> -        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>>> -    }
>>> -    if (hpsize == LONG_MAX) {
>>> -        /* No additional memory regions found ==> Report main RAM page size */
>>> -        return mainrampagesize;
>>> -    }
>>> -
>>> -    /* If NUMA is disabled or the NUMA nodes are not backed with a
>>> -     * memory-backend, then there is at least one node using "normal" RAM,
>>> -     * so if its page size is smaller we have got to report that size instead.
>>> -     */
>>> -    if (hpsize > mainrampagesize &&
>>> -        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
>>> -        static bool warned;
>>> -        if (!warned) {
>>> -            error_report("Huge page support disabled (n/a for main memory).");
>>> -            warned = true;
>>> -        }
>>> -        return mainrampagesize;
>>> -    }
>>> -
>>> -    return hpsize;
>>> -}
>>> -
>>>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>>>  {
>>>      if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
>>> @@ -454,7 +355,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>>>          has_smmu_info = true;
>>>      }
>>>  
>>> -    rampagesize = getrampagesize();
>>> +    rampagesize = qemu_getrampagesize();
>>>  
>>>      /* Convert to QEMU form */
>>>      memset(&env->sps, 0, sizeof(env->sps));
>>> @@ -2177,7 +2078,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>>>      /* Find the largest hardware supported page size that's less than
>>>       * or equal to the (logical) backing page size of guest RAM */
>>>      kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
>>> -    rampagesize = getrampagesize();
>>> +    rampagesize = qemu_getrampagesize();
>>>      best_page_shift = 0;
>>>  
>>>      for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
>>> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
>>> index 5a85aa3c89..564c79109c 100644
>>> --- a/util/mmap-alloc.c
>>> +++ b/util/mmap-alloc.c
>>> @@ -39,6 +39,31 @@ size_t qemu_fd_getpagesize(int fd)
>>>      return getpagesize();
>>>  }
>>>  
>>> +size_t qemu_mempath_getpagesize(const char *mem_path)
>>> +{
>>> +#ifdef CONFIG_LINUX
>>> +    struct statfs fs;
>>> +    int ret;
>>> +
>>> +    do {
>>> +        ret = statfs(mem_path, &fs);
>>> +    } while (ret != 0 && errno == EINTR);
>>> +
>>> +    if (ret != 0) {
>>> +        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
>>> +                strerror(errno));
>>> +        exit(1);
>>> +    }
>>> +
>>> +    if (fs.f_type == HUGETLBFS_MAGIC) {
>>> +        /* It's hugepage, return the huge page size */
>>> +        return fs.f_bsize;
>>> +    }
>>> +#endif
>>> +
>>> +    return getpagesize();
>>> +}
>>> +
>>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
>>>  {
>>>      /*
>>
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2017-02-09 11:48       ` Paolo Bonzini
@ 2017-02-10  0:41         ` David Gibson
  2017-02-28  8:12           ` Alexey Kardashevskiy
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2017-02-10  0:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alexey Kardashevskiy, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 13130 bytes --]

On Thu, Feb 09, 2017 at 12:48:19PM +0100, Paolo Bonzini wrote:
> 
> 
> On 09/02/2017 06:43, Alexey Kardashevskiy wrote:
> > On 03/01/17 10:34, David Gibson wrote:
> >> On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
> >>> getrampagesize() returns the largest supported page size and mainly
> >>> used to know if huge pages are enabled.
> >>>
> >>> However is implemented in target-ppc/kvm.c and not available
> >>> in TCG or other architectures.
> >>>
> >>> This renames and moves gethugepagesize() to mmap-alloc.c where
> >>> fd-based analog of it is already implemented. This renames and moves
> >>> getrampagesize() to exec.c as it seems to be the common place for
> >>> helpers like this.
> >>>
> >>> This first user for it is going to be a spapr-pci-host-bridge which
> >>> needs to know the largest RAM page size so the guest could try
> >>> using bigger IOMMU pages to save memory.
> >>>
> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>
> >> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> >>
> >> Seems sensible to me, but I'm not comfortable merging this via my tree
> >> since it touches such core code.  Probably should go via Paolo.
> > 
> > Paolo, ping?
> 
> It's just code movement, go ahead.

Ok, I've merged this in my tree.


> 
> Paolo
> 
> > 
> > 
> >>
> >>> ---
> >>>  include/exec/ram_addr.h   |   1 +
> >>>  include/qemu/mmap-alloc.h |   2 +
> >>>  exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
> >>>  target-ppc/kvm.c          | 105 ++--------------------------------------------
> >>>  util/mmap-alloc.c         |  25 +++++++++++
> >>>  5 files changed, 113 insertions(+), 102 deletions(-)
> >>>
> >>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> >>> index 54d7108a9e..3935cbcfcd 100644
> >>> --- a/include/exec/ram_addr.h
> >>> +++ b/include/exec/ram_addr.h
> >>> @@ -91,6 +91,7 @@ typedef struct RAMList {
> >>>  } RAMList;
> >>>  extern RAMList ram_list;
> >>>  
> >>> +long qemu_getrampagesize(void);
> >>>  ram_addr_t last_ram_offset(void);
> >>>  void qemu_mutex_lock_ramlist(void);
> >>>  void qemu_mutex_unlock_ramlist(void);
> >>> diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
> >>> index 933c024ac5..50385e3f81 100644
> >>> --- a/include/qemu/mmap-alloc.h
> >>> +++ b/include/qemu/mmap-alloc.h
> >>> @@ -5,6 +5,8 @@
> >>>  
> >>>  size_t qemu_fd_getpagesize(int fd);
> >>>  
> >>> +size_t qemu_mempath_getpagesize(const char *mem_path);
> >>> +
> >>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
> >>>  
> >>>  void qemu_ram_munmap(void *ptr, size_t size);
> >>> diff --git a/exec.c b/exec.c
> >>> index 08c558eecf..d73b477a70 100644
> >>> --- a/exec.c
> >>> +++ b/exec.c
> >>> @@ -32,6 +32,7 @@
> >>>  #endif
> >>>  #include "sysemu/kvm.h"
> >>>  #include "sysemu/sysemu.h"
> >>> +#include "sysemu/numa.h"
> >>>  #include "qemu/timer.h"
> >>>  #include "qemu/config-file.h"
> >>>  #include "qemu/error-report.h"
> >>> @@ -1218,6 +1219,87 @@ void qemu_mutex_unlock_ramlist(void)
> >>>  }
> >>>  
> >>>  #ifdef __linux__
> >>> +/*
> >>> + * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
> >>> + * may or may not name the same files / on the same filesystem now as
> >>> + * when we actually open and map them.  Iterate over the file
> >>> + * descriptors instead, and use qemu_fd_getpagesize().
> >>> + */
> >>> +static int find_max_supported_pagesize(Object *obj, void *opaque)
> >>> +{
> >>> +    char *mem_path;
> >>> +    long *hpsize_min = opaque;
> >>> +
> >>> +    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
> >>> +        mem_path = object_property_get_str(obj, "mem-path", NULL);
> >>> +        if (mem_path) {
> >>> +            long hpsize = qemu_mempath_getpagesize(mem_path);
> >>> +            if (hpsize < *hpsize_min) {
> >>> +                *hpsize_min = hpsize;
> >>> +            }
> >>> +        } else {
> >>> +            *hpsize_min = getpagesize();
> >>> +        }
> >>> +    }
> >>> +
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +long qemu_getrampagesize(void)
> >>> +{
> >>> +    long hpsize = LONG_MAX;
> >>> +    long mainrampagesize;
> >>> +    Object *memdev_root;
> >>> +
> >>> +    if (mem_path) {
> >>> +        mainrampagesize = qemu_mempath_getpagesize(mem_path);
> >>> +    } else {
> >>> +        mainrampagesize = getpagesize();
> >>> +    }
> >>> +
> >>> +    /* it's possible we have memory-backend objects with
> >>> +     * hugepage-backed RAM. these may get mapped into system
> >>> +     * address space via -numa parameters or memory hotplug
> >>> +     * hooks. we want to take these into account, but we
> >>> +     * also want to make sure these supported hugepage
> >>> +     * sizes are applicable across the entire range of memory
> >>> +     * we may boot from, so we take the min across all
> >>> +     * backends, and assume normal pages in cases where a
> >>> +     * backend isn't backed by hugepages.
> >>> +     */
> >>> +    memdev_root = object_resolve_path("/objects", NULL);
> >>> +    if (memdev_root) {
> >>> +        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
> >>> +    }
> >>> +    if (hpsize == LONG_MAX) {
> >>> +        /* No additional memory regions found ==> Report main RAM page size */
> >>> +        return mainrampagesize;
> >>> +    }
> >>> +
> >>> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
> >>> +     * memory-backend, then there is at least one node using "normal" RAM,
> >>> +     * so if its page size is smaller we have got to report that size instead.
> >>> +     */
> >>> +    if (hpsize > mainrampagesize &&
> >>> +        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
> >>> +        static bool warned;
> >>> +        if (!warned) {
> >>> +            error_report("Huge page support disabled (n/a for main memory).");
> >>> +            warned = true;
> >>> +        }
> >>> +        return mainrampagesize;
> >>> +    }
> >>> +
> >>> +    return hpsize;
> >>> +}
> >>> +#else
> >>> +long qemu_getrampagesize(void)
> >>> +{
> >>> +    return getpagesize();
> >>> +}
> >>> +#endif
> >>> +
> >>> +#ifdef __linux__
> >>>  static int64_t get_file_size(int fd)
> >>>  {
> >>>      int64_t size = lseek(fd, 0, SEEK_END);
> >>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> >>> index 6e91a4d8bb..e0abffa8ad 100644
> >>> --- a/target-ppc/kvm.c
> >>> +++ b/target-ppc/kvm.c
> >>> @@ -42,6 +42,7 @@
> >>>  #include "trace.h"
> >>>  #include "exec/gdbstub.h"
> >>>  #include "exec/memattrs.h"
> >>> +#include "exec/ram_addr.h"
> >>>  #include "sysemu/hostmem.h"
> >>>  #include "qemu/cutils.h"
> >>>  #if defined(TARGET_PPC64)
> >>> @@ -325,106 +326,6 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
> >>>      kvm_get_fallback_smmu_info(cpu, info);
> >>>  }
> >>>  
> >>> -static long gethugepagesize(const char *mem_path)
> >>> -{
> >>> -    struct statfs fs;
> >>> -    int ret;
> >>> -
> >>> -    do {
> >>> -        ret = statfs(mem_path, &fs);
> >>> -    } while (ret != 0 && errno == EINTR);
> >>> -
> >>> -    if (ret != 0) {
> >>> -        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
> >>> -                strerror(errno));
> >>> -        exit(1);
> >>> -    }
> >>> -
> >>> -#define HUGETLBFS_MAGIC       0x958458f6
> >>> -
> >>> -    if (fs.f_type != HUGETLBFS_MAGIC) {
> >>> -        /* Explicit mempath, but it's ordinary pages */
> >>> -        return getpagesize();
> >>> -    }
> >>> -
> >>> -    /* It's hugepage, return the huge page size */
> >>> -    return fs.f_bsize;
> >>> -}
> >>> -
> >>> -/*
> >>> - * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
> >>> - * may or may not name the same files / on the same filesystem now as
> >>> - * when we actually open and map them.  Iterate over the file
> >>> - * descriptors instead, and use qemu_fd_getpagesize().
> >>> - */
> >>> -static int find_max_supported_pagesize(Object *obj, void *opaque)
> >>> -{
> >>> -    char *mem_path;
> >>> -    long *hpsize_min = opaque;
> >>> -
> >>> -    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
> >>> -        mem_path = object_property_get_str(obj, "mem-path", NULL);
> >>> -        if (mem_path) {
> >>> -            long hpsize = gethugepagesize(mem_path);
> >>> -            if (hpsize < *hpsize_min) {
> >>> -                *hpsize_min = hpsize;
> >>> -            }
> >>> -        } else {
> >>> -            *hpsize_min = getpagesize();
> >>> -        }
> >>> -    }
> >>> -
> >>> -    return 0;
> >>> -}
> >>> -
> >>> -static long getrampagesize(void)
> >>> -{
> >>> -    long hpsize = LONG_MAX;
> >>> -    long mainrampagesize;
> >>> -    Object *memdev_root;
> >>> -
> >>> -    if (mem_path) {
> >>> -        mainrampagesize = gethugepagesize(mem_path);
> >>> -    } else {
> >>> -        mainrampagesize = getpagesize();
> >>> -    }
> >>> -
> >>> -    /* it's possible we have memory-backend objects with
> >>> -     * hugepage-backed RAM. these may get mapped into system
> >>> -     * address space via -numa parameters or memory hotplug
> >>> -     * hooks. we want to take these into account, but we
> >>> -     * also want to make sure these supported hugepage
> >>> -     * sizes are applicable across the entire range of memory
> >>> -     * we may boot from, so we take the min across all
> >>> -     * backends, and assume normal pages in cases where a
> >>> -     * backend isn't backed by hugepages.
> >>> -     */
> >>> -    memdev_root = object_resolve_path("/objects", NULL);
> >>> -    if (memdev_root) {
> >>> -        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
> >>> -    }
> >>> -    if (hpsize == LONG_MAX) {
> >>> -        /* No additional memory regions found ==> Report main RAM page size */
> >>> -        return mainrampagesize;
> >>> -    }
> >>> -
> >>> -    /* If NUMA is disabled or the NUMA nodes are not backed with a
> >>> -     * memory-backend, then there is at least one node using "normal" RAM,
> >>> -     * so if its page size is smaller we have got to report that size instead.
> >>> -     */
> >>> -    if (hpsize > mainrampagesize &&
> >>> -        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
> >>> -        static bool warned;
> >>> -        if (!warned) {
> >>> -            error_report("Huge page support disabled (n/a for main memory).");
> >>> -            warned = true;
> >>> -        }
> >>> -        return mainrampagesize;
> >>> -    }
> >>> -
> >>> -    return hpsize;
> >>> -}
> >>> -
> >>>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
> >>>  {
> >>>      if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
> >>> @@ -454,7 +355,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >>>          has_smmu_info = true;
> >>>      }
> >>>  
> >>> -    rampagesize = getrampagesize();
> >>> +    rampagesize = qemu_getrampagesize();
> >>>  
> >>>      /* Convert to QEMU form */
> >>>      memset(&env->sps, 0, sizeof(env->sps));
> >>> @@ -2177,7 +2078,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
> >>>      /* Find the largest hardware supported page size that's less than
> >>>       * or equal to the (logical) backing page size of guest RAM */
> >>>      kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
> >>> -    rampagesize = getrampagesize();
> >>> +    rampagesize = qemu_getrampagesize();
> >>>      best_page_shift = 0;
> >>>  
> >>>      for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
> >>> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> >>> index 5a85aa3c89..564c79109c 100644
> >>> --- a/util/mmap-alloc.c
> >>> +++ b/util/mmap-alloc.c
> >>> @@ -39,6 +39,31 @@ size_t qemu_fd_getpagesize(int fd)
> >>>      return getpagesize();
> >>>  }
> >>>  
> >>> +size_t qemu_mempath_getpagesize(const char *mem_path)
> >>> +{
> >>> +#ifdef CONFIG_LINUX
> >>> +    struct statfs fs;
> >>> +    int ret;
> >>> +
> >>> +    do {
> >>> +        ret = statfs(mem_path, &fs);
> >>> +    } while (ret != 0 && errno == EINTR);
> >>> +
> >>> +    if (ret != 0) {
> >>> +        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
> >>> +                strerror(errno));
> >>> +        exit(1);
> >>> +    }
> >>> +
> >>> +    if (fs.f_type == HUGETLBFS_MAGIC) {
> >>> +        /* It's hugepage, return the huge page size */
> >>> +        return fs.f_bsize;
> >>> +    }
> >>> +#endif
> >>> +
> >>> +    return getpagesize();
> >>> +}
> >>> +
> >>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
> >>>  {
> >>>      /*
> >>
> > 
> > 
> 




-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2017-02-10  0:41         ` David Gibson
@ 2017-02-28  8:12           ` Alexey Kardashevskiy
  2017-03-01  1:06             ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: Alexey Kardashevskiy @ 2017-02-28  8:12 UTC (permalink / raw)
  To: David Gibson, Paolo Bonzini; +Cc: qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 13076 bytes --]

On 10/02/17 11:41, David Gibson wrote:
> On Thu, Feb 09, 2017 at 12:48:19PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 09/02/2017 06:43, Alexey Kardashevskiy wrote:
>>> On 03/01/17 10:34, David Gibson wrote:
>>>> On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
>>>>> getrampagesize() returns the largest supported page size and mainly
>>>>> used to know if huge pages are enabled.
>>>>>
>>>>> However is implemented in target-ppc/kvm.c and not available
>>>>> in TCG or other architectures.
>>>>>
>>>>> This renames and moves gethugepagesize() to mmap-alloc.c where
>>>>> fd-based analog of it is already implemented. This renames and moves
>>>>> getrampagesize() to exec.c as it seems to be the common place for
>>>>> helpers like this.
>>>>>
>>>>> This first user for it is going to be a spapr-pci-host-bridge which
>>>>> needs to know the largest RAM page size so the guest could try
>>>>> using bigger IOMMU pages to save memory.
>>>>>
>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>
>>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>>>>
>>>> Seems sensible to me, but I'm not comfortable merging this via my tree
>>>> since it touches such core code.  Probably should go via Paolo.
>>>
>>> Paolo, ping?
>>
>> It's just code movement, go ahead.
> 
> Ok, I've merged this in my tree.


I cannot find it in any public tree, where did it go?



> 
> 
>>
>> Paolo
>>
>>>
>>>
>>>>
>>>>> ---
>>>>>  include/exec/ram_addr.h   |   1 +
>>>>>  include/qemu/mmap-alloc.h |   2 +
>>>>>  exec.c                    |  82 ++++++++++++++++++++++++++++++++++++
>>>>>  target-ppc/kvm.c          | 105 ++--------------------------------------------
>>>>>  util/mmap-alloc.c         |  25 +++++++++++
>>>>>  5 files changed, 113 insertions(+), 102 deletions(-)
>>>>>
>>>>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>>>>> index 54d7108a9e..3935cbcfcd 100644
>>>>> --- a/include/exec/ram_addr.h
>>>>> +++ b/include/exec/ram_addr.h
>>>>> @@ -91,6 +91,7 @@ typedef struct RAMList {
>>>>>  } RAMList;
>>>>>  extern RAMList ram_list;
>>>>>  
>>>>> +long qemu_getrampagesize(void);
>>>>>  ram_addr_t last_ram_offset(void);
>>>>>  void qemu_mutex_lock_ramlist(void);
>>>>>  void qemu_mutex_unlock_ramlist(void);
>>>>> diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
>>>>> index 933c024ac5..50385e3f81 100644
>>>>> --- a/include/qemu/mmap-alloc.h
>>>>> +++ b/include/qemu/mmap-alloc.h
>>>>> @@ -5,6 +5,8 @@
>>>>>  
>>>>>  size_t qemu_fd_getpagesize(int fd);
>>>>>  
>>>>> +size_t qemu_mempath_getpagesize(const char *mem_path);
>>>>> +
>>>>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
>>>>>  
>>>>>  void qemu_ram_munmap(void *ptr, size_t size);
>>>>> diff --git a/exec.c b/exec.c
>>>>> index 08c558eecf..d73b477a70 100644
>>>>> --- a/exec.c
>>>>> +++ b/exec.c
>>>>> @@ -32,6 +32,7 @@
>>>>>  #endif
>>>>>  #include "sysemu/kvm.h"
>>>>>  #include "sysemu/sysemu.h"
>>>>> +#include "sysemu/numa.h"
>>>>>  #include "qemu/timer.h"
>>>>>  #include "qemu/config-file.h"
>>>>>  #include "qemu/error-report.h"
>>>>> @@ -1218,6 +1219,87 @@ void qemu_mutex_unlock_ramlist(void)
>>>>>  }
>>>>>  
>>>>>  #ifdef __linux__
>>>>> +/*
>>>>> + * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
>>>>> + * may or may not name the same files / on the same filesystem now as
>>>>> + * when we actually open and map them.  Iterate over the file
>>>>> + * descriptors instead, and use qemu_fd_getpagesize().
>>>>> + */
>>>>> +static int find_max_supported_pagesize(Object *obj, void *opaque)
>>>>> +{
>>>>> +    char *mem_path;
>>>>> +    long *hpsize_min = opaque;
>>>>> +
>>>>> +    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
>>>>> +        mem_path = object_property_get_str(obj, "mem-path", NULL);
>>>>> +        if (mem_path) {
>>>>> +            long hpsize = qemu_mempath_getpagesize(mem_path);
>>>>> +            if (hpsize < *hpsize_min) {
>>>>> +                *hpsize_min = hpsize;
>>>>> +            }
>>>>> +        } else {
>>>>> +            *hpsize_min = getpagesize();
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +long qemu_getrampagesize(void)
>>>>> +{
>>>>> +    long hpsize = LONG_MAX;
>>>>> +    long mainrampagesize;
>>>>> +    Object *memdev_root;
>>>>> +
>>>>> +    if (mem_path) {
>>>>> +        mainrampagesize = qemu_mempath_getpagesize(mem_path);
>>>>> +    } else {
>>>>> +        mainrampagesize = getpagesize();
>>>>> +    }
>>>>> +
>>>>> +    /* it's possible we have memory-backend objects with
>>>>> +     * hugepage-backed RAM. these may get mapped into system
>>>>> +     * address space via -numa parameters or memory hotplug
>>>>> +     * hooks. we want to take these into account, but we
>>>>> +     * also want to make sure these supported hugepage
>>>>> +     * sizes are applicable across the entire range of memory
>>>>> +     * we may boot from, so we take the min across all
>>>>> +     * backends, and assume normal pages in cases where a
>>>>> +     * backend isn't backed by hugepages.
>>>>> +     */
>>>>> +    memdev_root = object_resolve_path("/objects", NULL);
>>>>> +    if (memdev_root) {
>>>>> +        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>>>>> +    }
>>>>> +    if (hpsize == LONG_MAX) {
>>>>> +        /* No additional memory regions found ==> Report main RAM page size */
>>>>> +        return mainrampagesize;
>>>>> +    }
>>>>> +
>>>>> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
>>>>> +     * memory-backend, then there is at least one node using "normal" RAM,
>>>>> +     * so if its page size is smaller we have got to report that size instead.
>>>>> +     */
>>>>> +    if (hpsize > mainrampagesize &&
>>>>> +        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
>>>>> +        static bool warned;
>>>>> +        if (!warned) {
>>>>> +            error_report("Huge page support disabled (n/a for main memory).");
>>>>> +            warned = true;
>>>>> +        }
>>>>> +        return mainrampagesize;
>>>>> +    }
>>>>> +
>>>>> +    return hpsize;
>>>>> +}
>>>>> +#else
>>>>> +long qemu_getrampagesize(void)
>>>>> +{
>>>>> +    return getpagesize();
>>>>> +}
>>>>> +#endif
>>>>> +
>>>>> +#ifdef __linux__
>>>>>  static int64_t get_file_size(int fd)
>>>>>  {
>>>>>      int64_t size = lseek(fd, 0, SEEK_END);
>>>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>>>> index 6e91a4d8bb..e0abffa8ad 100644
>>>>> --- a/target-ppc/kvm.c
>>>>> +++ b/target-ppc/kvm.c
>>>>> @@ -42,6 +42,7 @@
>>>>>  #include "trace.h"
>>>>>  #include "exec/gdbstub.h"
>>>>>  #include "exec/memattrs.h"
>>>>> +#include "exec/ram_addr.h"
>>>>>  #include "sysemu/hostmem.h"
>>>>>  #include "qemu/cutils.h"
>>>>>  #if defined(TARGET_PPC64)
>>>>> @@ -325,106 +326,6 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
>>>>>      kvm_get_fallback_smmu_info(cpu, info);
>>>>>  }
>>>>>  
>>>>> -static long gethugepagesize(const char *mem_path)
>>>>> -{
>>>>> -    struct statfs fs;
>>>>> -    int ret;
>>>>> -
>>>>> -    do {
>>>>> -        ret = statfs(mem_path, &fs);
>>>>> -    } while (ret != 0 && errno == EINTR);
>>>>> -
>>>>> -    if (ret != 0) {
>>>>> -        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
>>>>> -                strerror(errno));
>>>>> -        exit(1);
>>>>> -    }
>>>>> -
>>>>> -#define HUGETLBFS_MAGIC       0x958458f6
>>>>> -
>>>>> -    if (fs.f_type != HUGETLBFS_MAGIC) {
>>>>> -        /* Explicit mempath, but it's ordinary pages */
>>>>> -        return getpagesize();
>>>>> -    }
>>>>> -
>>>>> -    /* It's hugepage, return the huge page size */
>>>>> -    return fs.f_bsize;
>>>>> -}
>>>>> -
>>>>> -/*
>>>>> - * FIXME TOCTTOU: this iterates over memory backends' mem-path, which
>>>>> - * may or may not name the same files / on the same filesystem now as
>>>>> - * when we actually open and map them.  Iterate over the file
>>>>> - * descriptors instead, and use qemu_fd_getpagesize().
>>>>> - */
>>>>> -static int find_max_supported_pagesize(Object *obj, void *opaque)
>>>>> -{
>>>>> -    char *mem_path;
>>>>> -    long *hpsize_min = opaque;
>>>>> -
>>>>> -    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
>>>>> -        mem_path = object_property_get_str(obj, "mem-path", NULL);
>>>>> -        if (mem_path) {
>>>>> -            long hpsize = gethugepagesize(mem_path);
>>>>> -            if (hpsize < *hpsize_min) {
>>>>> -                *hpsize_min = hpsize;
>>>>> -            }
>>>>> -        } else {
>>>>> -            *hpsize_min = getpagesize();
>>>>> -        }
>>>>> -    }
>>>>> -
>>>>> -    return 0;
>>>>> -}
>>>>> -
>>>>> -static long getrampagesize(void)
>>>>> -{
>>>>> -    long hpsize = LONG_MAX;
>>>>> -    long mainrampagesize;
>>>>> -    Object *memdev_root;
>>>>> -
>>>>> -    if (mem_path) {
>>>>> -        mainrampagesize = gethugepagesize(mem_path);
>>>>> -    } else {
>>>>> -        mainrampagesize = getpagesize();
>>>>> -    }
>>>>> -
>>>>> -    /* it's possible we have memory-backend objects with
>>>>> -     * hugepage-backed RAM. these may get mapped into system
>>>>> -     * address space via -numa parameters or memory hotplug
>>>>> -     * hooks. we want to take these into account, but we
>>>>> -     * also want to make sure these supported hugepage
>>>>> -     * sizes are applicable across the entire range of memory
>>>>> -     * we may boot from, so we take the min across all
>>>>> -     * backends, and assume normal pages in cases where a
>>>>> -     * backend isn't backed by hugepages.
>>>>> -     */
>>>>> -    memdev_root = object_resolve_path("/objects", NULL);
>>>>> -    if (memdev_root) {
>>>>> -        object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>>>>> -    }
>>>>> -    if (hpsize == LONG_MAX) {
>>>>> -        /* No additional memory regions found ==> Report main RAM page size */
>>>>> -        return mainrampagesize;
>>>>> -    }
>>>>> -
>>>>> -    /* If NUMA is disabled or the NUMA nodes are not backed with a
>>>>> -     * memory-backend, then there is at least one node using "normal" RAM,
>>>>> -     * so if its page size is smaller we have got to report that size instead.
>>>>> -     */
>>>>> -    if (hpsize > mainrampagesize &&
>>>>> -        (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
>>>>> -        static bool warned;
>>>>> -        if (!warned) {
>>>>> -            error_report("Huge page support disabled (n/a for main memory).");
>>>>> -            warned = true;
>>>>> -        }
>>>>> -        return mainrampagesize;
>>>>> -    }
>>>>> -
>>>>> -    return hpsize;
>>>>> -}
>>>>> -
>>>>>  static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
>>>>>  {
>>>>>      if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
>>>>> @@ -454,7 +355,7 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>>>>>          has_smmu_info = true;
>>>>>      }
>>>>>  
>>>>> -    rampagesize = getrampagesize();
>>>>> +    rampagesize = qemu_getrampagesize();
>>>>>  
>>>>>      /* Convert to QEMU form */
>>>>>      memset(&env->sps, 0, sizeof(env->sps));
>>>>> @@ -2177,7 +2078,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>>>>>      /* Find the largest hardware supported page size that's less than
>>>>>       * or equal to the (logical) backing page size of guest RAM */
>>>>>      kvm_get_smmu_info(POWERPC_CPU(first_cpu), &info);
>>>>> -    rampagesize = getrampagesize();
>>>>> +    rampagesize = qemu_getrampagesize();
>>>>>      best_page_shift = 0;
>>>>>  
>>>>>      for (i = 0; i < KVM_PPC_PAGE_SIZES_MAX_SZ; i++) {
>>>>> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
>>>>> index 5a85aa3c89..564c79109c 100644
>>>>> --- a/util/mmap-alloc.c
>>>>> +++ b/util/mmap-alloc.c
>>>>> @@ -39,6 +39,31 @@ size_t qemu_fd_getpagesize(int fd)
>>>>>      return getpagesize();
>>>>>  }
>>>>>  
>>>>> +size_t qemu_mempath_getpagesize(const char *mem_path)
>>>>> +{
>>>>> +#ifdef CONFIG_LINUX
>>>>> +    struct statfs fs;
>>>>> +    int ret;
>>>>> +
>>>>> +    do {
>>>>> +        ret = statfs(mem_path, &fs);
>>>>> +    } while (ret != 0 && errno == EINTR);
>>>>> +
>>>>> +    if (ret != 0) {
>>>>> +        fprintf(stderr, "Couldn't statfs() memory path: %s\n",
>>>>> +                strerror(errno));
>>>>> +        exit(1);
>>>>> +    }
>>>>> +
>>>>> +    if (fs.f_type == HUGETLBFS_MAGIC) {
>>>>> +        /* It's hugepage, return the huge page size */
>>>>> +        return fs.f_bsize;
>>>>> +    }
>>>>> +#endif
>>>>> +
>>>>> +    return getpagesize();
>>>>> +}
>>>>> +
>>>>>  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
>>>>>  {
>>>>>      /*
>>>>
>>>
>>>
>>
> 
> 
> 
> 


-- 
Alexey


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 839 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2017-02-28  8:12           ` Alexey Kardashevskiy
@ 2017-03-01  1:06             ` David Gibson
  2017-03-01  2:52               ` David Gibson
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2017-03-01  1:06 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Paolo Bonzini, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 1869 bytes --]

On Tue, Feb 28, 2017 at 07:12:53PM +1100, Alexey Kardashevskiy wrote:
> On 10/02/17 11:41, David Gibson wrote:
> > On Thu, Feb 09, 2017 at 12:48:19PM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 09/02/2017 06:43, Alexey Kardashevskiy wrote:
> >>> On 03/01/17 10:34, David Gibson wrote:
> >>>> On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
> >>>>> getrampagesize() returns the largest supported page size and mainly
> >>>>> used to know if huge pages are enabled.
> >>>>>
> >>>>> However is implemented in target-ppc/kvm.c and not available
> >>>>> in TCG or other architectures.
> >>>>>
> >>>>> This renames and moves gethugepagesize() to mmap-alloc.c where
> >>>>> fd-based analog of it is already implemented. This renames and moves
> >>>>> getrampagesize() to exec.c as it seems to be the common place for
> >>>>> helpers like this.
> >>>>>
> >>>>> This first user for it is going to be a spapr-pci-host-bridge which
> >>>>> needs to know the largest RAM page size so the guest could try
> >>>>> using bigger IOMMU pages to save memory.
> >>>>>
> >>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>>
> >>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> >>>>
> >>>> Seems sensible to me, but I'm not comfortable merging this via my tree
> >>>> since it touches such core code.  Probably should go via Paolo.
> >>>
> >>> Paolo, ping?
> >>
> >> It's just code movement, go ahead.
> > 
> > Ok, I've merged this in my tree.
> 
> 
> I cannot find it in any public tree, where did it go?

Bother, I thought I'd applied it, but I seem to have mislaid it.  Can
you resend it to me please.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code
  2017-03-01  1:06             ` David Gibson
@ 2017-03-01  2:52               ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2017-03-01  2:52 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Paolo Bonzini, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2203 bytes --]

On Wed, Mar 01, 2017 at 12:06:14PM +1100, David Gibson wrote:
> On Tue, Feb 28, 2017 at 07:12:53PM +1100, Alexey Kardashevskiy wrote:
> > On 10/02/17 11:41, David Gibson wrote:
> > > On Thu, Feb 09, 2017 at 12:48:19PM +0100, Paolo Bonzini wrote:
> > >>
> > >>
> > >> On 09/02/2017 06:43, Alexey Kardashevskiy wrote:
> > >>> On 03/01/17 10:34, David Gibson wrote:
> > >>>> On Thu, Dec 22, 2016 at 04:22:11PM +1100, Alexey Kardashevskiy wrote:
> > >>>>> getrampagesize() returns the largest supported page size and mainly
> > >>>>> used to know if huge pages are enabled.
> > >>>>>
> > >>>>> However is implemented in target-ppc/kvm.c and not available
> > >>>>> in TCG or other architectures.
> > >>>>>
> > >>>>> This renames and moves gethugepagesize() to mmap-alloc.c where
> > >>>>> fd-based analog of it is already implemented. This renames and moves
> > >>>>> getrampagesize() to exec.c as it seems to be the common place for
> > >>>>> helpers like this.
> > >>>>>
> > >>>>> This first user for it is going to be a spapr-pci-host-bridge which
> > >>>>> needs to know the largest RAM page size so the guest could try
> > >>>>> using bigger IOMMU pages to save memory.
> > >>>>>
> > >>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > >>>>
> > >>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > >>>>
> > >>>> Seems sensible to me, but I'm not comfortable merging this via my tree
> > >>>> since it touches such core code.  Probably should go via Paolo.
> > >>>
> > >>> Paolo, ping?
> > >>
> > >> It's just code movement, go ahead.
> > > 
> > > Ok, I've merged this in my tree.
> > 
> > 
> > I cannot find it in any public tree, where did it go?
> 
> Bother, I thought I'd applied it, but I seem to have mislaid it.  Can
> you resend it to me please.

So, I manage to dig this up from my archives, but it doesn't seem to
apply properly on the current tree.  The actual merge conflicts were
easy to resolve, but then it doesn't compile.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-03-01  3:53 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-12-22  5:22 [Qemu-devel] [PATCH qemu 0/2] exec, spapr_pci: Advertise huge IOMMU pages Alexey Kardashevskiy
2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 1/2] exec, kvm, target-ppc: Move getrampagesize() to common code Alexey Kardashevskiy
2017-01-02 23:34   ` David Gibson
2017-02-09  5:43     ` Alexey Kardashevskiy
2017-02-09 11:48       ` Paolo Bonzini
2017-02-10  0:41         ` David Gibson
2017-02-28  8:12           ` Alexey Kardashevskiy
2017-03-01  1:06             ` David Gibson
2017-03-01  2:52               ` David Gibson
2016-12-22  5:22 ` [Qemu-devel] [PATCH qemu 2/2] spapr_pci: Advertise 16M IOMMU pages when available Alexey Kardashevskiy
2017-01-02 23:41   ` David Gibson
2017-01-09  2:06     ` Alexey Kardashevskiy
2017-01-12  5:09       ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).