qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Greg Kurz <groug@kaod.org>
Cc: qemu-devel@nongnu.org, Peter Maydell <peter.maydell@linaro.org>,
	Murilo Opsfelder Araujo <muriloo@linux.ibm.com>,
	Peter Crosthwaite <crosthwaite.peter@gmail.com>,
	Richard Henderson <rth@twiddle.net>,
	Paolo Bonzini <pbonzini@redhat.com>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PULL 23/25] mmap-alloc: fix hugetlbfs misaligned length in ppc64
Date: Mon, 4 Feb 2019 10:20:28 -0500
Message-ID: <20190204101958-mutt-send-email-mst@kernel.org>
In-Reply-To: <20190204161554.253d810b@bahia.lan>

I see. Well, git should have no trouble resolving this.


On Mon, Feb 04, 2019 at 04:15:54PM +0100, Greg Kurz wrote:
> Hi Michael,
> 
> These two patches (22 and 23) from Murilo already got merged with a pull request
> from David earlier today.
> 
> Cheers,
> 
> --
> Greg
> 
> On Mon, 4 Feb 2019 09:44:04 -0500
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > From: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
> > 
> > The commit 7197fb4058bcb68986bae2bb2c04d6370f3e7218 ("util/mmap-alloc:
> > fix hugetlb support on ppc64") fixed Huge TLB mappings on ppc64.
> > 
> > However, we still need to consider the underlying huge page size
> > during munmap() because it requires that both address and length be a
> > multiple of the underlying huge page size for Huge TLB mappings.
> > Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES
> > section of the munmap(2) manual:
> > 
> >   "For munmap(), addr and length must both be a multiple of the
> >   underlying huge page size."
> > 
> > > On ppc64, the munmap() in qemu_ram_munmap() does not work for Huge TLB
> > > mappings because the mapped segment is aligned to the underlying huge
> > > page size rather than to the native system page size returned by
> > > getpagesize().
> > 
> > This has the side effect of not releasing huge pages back to the pool
> > after a hugetlbfs file-backed memory device is hot-unplugged.
> > 
> > This patch fixes the situation in qemu_ram_mmap() and
> > qemu_ram_munmap() by considering the underlying page size on ppc64.
> > 
> > After this patch, memory hot-unplug releases huge pages back to the
> > pool.
> > 
> > Fixes: 7197fb4058bcb68986bae2bb2c04d6370f3e7218
> > Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
> > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > Reviewed-by: Greg Kurz <groug@kaod.org>
> > ---
> >  include/qemu/mmap-alloc.h |  2 +-
> >  exec.c                    |  4 ++--
> >  util/mmap-alloc.c         | 22 ++++++++++++++++------
> >  util/oslib-posix.c        |  2 +-
> >  4 files changed, 20 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
> > index 50385e3f81..ef04f0ed5b 100644
> > --- a/include/qemu/mmap-alloc.h
> > +++ b/include/qemu/mmap-alloc.h
> > @@ -9,6 +9,6 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
> >  
> >  void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared);
> >  
> > -void qemu_ram_munmap(void *ptr, size_t size);
> > +void qemu_ram_munmap(int fd, void *ptr, size_t size);
> >  
> >  #endif
> > diff --git a/exec.c b/exec.c
> > index 25f3938a27..03dd673d36 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -1873,7 +1873,7 @@ static void *file_ram_alloc(RAMBlock *block,
> >      if (mem_prealloc) {
> >          os_mem_prealloc(fd, area, memory, smp_cpus, errp);
> >          if (errp && *errp) {
> > -            qemu_ram_munmap(area, memory);
> > +            qemu_ram_munmap(fd, area, memory);
> >              return NULL;
> >          }
> >      }
> > @@ -2394,7 +2394,7 @@ static void reclaim_ramblock(RAMBlock *block)
> >          xen_invalidate_map_cache_entry(block->host);
> >  #ifndef _WIN32
> >      } else if (block->fd >= 0) {
> > -        qemu_ram_munmap(block->host, block->max_length);
> > +        qemu_ram_munmap(block->fd, block->host, block->max_length);
> >          close(block->fd);
> >  #endif
> >      } else {
> > diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> > index f71ea038c8..8565885420 100644
> > --- a/util/mmap-alloc.c
> > +++ b/util/mmap-alloc.c
> > @@ -80,6 +80,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
> >      int flags;
> >      int guardfd;
> >      size_t offset;
> > +    size_t pagesize;
> >      size_t total;
> >      void *guardptr;
> >      void *ptr;
> > @@ -100,7 +101,8 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
> >       * anonymous memory is OK.
> >       */
> >      flags = MAP_PRIVATE;
> > -    if (fd == -1 || qemu_fd_getpagesize(fd) == getpagesize()) {
> > +    pagesize = qemu_fd_getpagesize(fd);
> > +    if (fd == -1 || pagesize == getpagesize()) {
> >          guardfd = -1;
> >          flags |= MAP_ANONYMOUS;
> >      } else {
> > @@ -109,6 +111,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
> >      }
> >  #else
> >      guardfd = -1;
> > +    pagesize = getpagesize();
> >      flags = MAP_PRIVATE | MAP_ANONYMOUS;
> >  #endif
> >  
> > @@ -120,7 +123,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
> >  
> >      assert(is_power_of_2(align));
> >      /* Always align to host page size */
> > -    assert(align >= getpagesize());
> > +    assert(align >= pagesize);
> >  
> >      flags = MAP_FIXED;
> >      flags |= fd == -1 ? MAP_ANONYMOUS : 0;
> > @@ -143,17 +146,24 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
> >       * a guard page guarding against potential buffer overflows.
> >       */
> >      total -= offset;
> > -    if (total > size + getpagesize()) {
> > -        munmap(ptr + size + getpagesize(), total - size - getpagesize());
> > +    if (total > size + pagesize) {
> > +        munmap(ptr + size + pagesize, total - size - pagesize);
> >      }
> >  
> >      return ptr;
> >  }
> >  
> > -void qemu_ram_munmap(void *ptr, size_t size)
> > +void qemu_ram_munmap(int fd, void *ptr, size_t size)
> >  {
> > +    size_t pagesize;
> > +
> >      if (ptr) {
> >          /* Unmap both the RAM block and the guard page */
> > -        munmap(ptr, size + getpagesize());
> > +#if defined(__powerpc64__) && defined(__linux__)
> > +        pagesize = qemu_fd_getpagesize(fd);
> > +#else
> > +        pagesize = getpagesize();
> > +#endif
> > +        munmap(ptr, size + pagesize);
> >      }
> >  }
> > diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> > index 4ce1ba9ca4..37c5854b9c 100644
> > --- a/util/oslib-posix.c
> > +++ b/util/oslib-posix.c
> > @@ -226,7 +226,7 @@ void qemu_vfree(void *ptr)
> >  void qemu_anon_ram_free(void *ptr, size_t size)
> >  {
> >      trace_qemu_anon_ram_free(ptr, size);
> > -    qemu_ram_munmap(ptr, size);
> > +    qemu_ram_munmap(-1, ptr, size);
> >  }
> >  
> >  void qemu_set_block(int fd)
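
For readers following the length arithmetic in the patch above, here is a
minimal sketch (not the QEMU code itself) of how the unmap length depends on
the page size backing the fd. The helper names fd_page_size() and
unmap_length() and the macro HUGETLBFS_MAGIC_SKETCH are hypothetical and were
chosen for illustration; the sketch assumes Linux, where fstatfs() reports the
hugetlbfs magic number in f_type and the huge page size in f_bsize.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>   /* fstatfs(), Linux-specific */
#include <unistd.h>

/* hugetlbfs filesystem magic (same value as HUGETLBFS_MAGIC in linux/magic.h) */
#define HUGETLBFS_MAGIC_SKETCH 0x958458f6UL

/*
 * Hypothetical helper: page size backing fd.
 * Returns the huge page size when fd lives on hugetlbfs, otherwise the
 * native system page size (also used for fd == -1, i.e. anonymous memory).
 */
static size_t fd_page_size(int fd)
{
    struct statfs fs;

    if (fd != -1 && fstatfs(fd, &fs) == 0 &&
        (unsigned long)fs.f_type == HUGETLBFS_MAGIC_SKETCH) {
        return (size_t)fs.f_bsize;
    }
    return (size_t)sysconf(_SC_PAGESIZE);
}

/*
 * Hypothetical helper: length to pass to munmap() for a RAM block of
 * `size` bytes followed by one guard page of `pagesize` bytes.
 * Assumes size is already a multiple of pagesize, so the result is too.
 */
static size_t unmap_length(size_t size, size_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    assert(size % pagesize == 0);
    return size + pagesize;
}
```

As an illustration with assumed numbers: for a 1 GiB block backed by 16 MiB
huge pages on a host with 64 KiB native pages, unmap_length() yields
1 GiB + 16 MiB, which is a multiple of the huge page size, whereas
1 GiB + 64 KiB is not and would not satisfy the munmap(2) requirement quoted
in the commit message.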
