qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
To: qemu-devel@nongnu.org, maxime.coquelin@redhat.com,
	marcandre.lureau@redhat.com, peterx@redhat.com,
	imammedo@redhat.com, mst@redhat.com
Cc: quintela@redhat.com, aarcange@redhat.com
Subject: [Qemu-devel] [PATCH v3 01/29] migrate: Update ram_block_discard_range for shared
Date: Fri, 16 Feb 2018 13:15:57 +0000	[thread overview]
Message-ID: <20180216131625.9639-2-dgilbert@redhat.com> (raw)
In-Reply-To: <20180216131625.9639-1-dgilbert@redhat.com>

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The choice of call to discard a block is getting more complicated
for other cases.   We use fallocate PUNCH_HOLE in any file cases;
it works for both hugepage and for tmpfs.
We use the DONTNEED for non-hugepage cases either where they're
anonymous or where they're private.

Care should be taken when trying other backing files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 exec.c       | 60 ++++++++++++++++++++++++++++++++++++++++++++++--------------
 trace-events |  3 ++-
 2 files changed, 48 insertions(+), 15 deletions(-)

diff --git a/exec.c b/exec.c
index e8d7b335b6..b1bb477776 100644
--- a/exec.c
+++ b/exec.c
@@ -3702,6 +3702,7 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
     }
 
     if ((start + length) <= rb->used_length) {
+        bool need_madvise, need_fallocate;
         uint8_t *host_endaddr = host_startaddr + length;
         if ((uintptr_t)host_endaddr & (rb->page_size - 1)) {
             error_report("ram_block_discard_range: Unaligned end address: %p",
@@ -3711,29 +3712,60 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
 
         errno = ENOTSUP; /* If we are missing MADVISE etc */
 
-        if (rb->page_size == qemu_host_page_size) {
-#if defined(CONFIG_MADVISE)
-            /* Note: We need the madvise MADV_DONTNEED behaviour of definitely
-             * freeing the page.
-             */
-            ret = madvise(host_startaddr, length, MADV_DONTNEED);
-#endif
-        } else {
-            /* Huge page case  - unfortunately it can't do DONTNEED, but
-             * it can do the equivalent by FALLOC_FL_PUNCH_HOLE in the
-             * huge page file.
+        /* The logic here is messy;
+         *    madvise DONTNEED fails for hugepages
+         *    fallocate works on hugepages and shmem
+         */
+        need_madvise = (rb->page_size == qemu_host_page_size);
+        need_fallocate = rb->fd != -1;
+        if (need_fallocate) {
+            /* For a file, this causes the area of the file to be zero'd
+             * if read, and for hugetlbfs also causes it to be unmapped
+             * so a userfault will trigger.
              */
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
             ret = fallocate(rb->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                             start, length);
+            if (ret) {
+                ret = -errno;
+                error_report("ram_block_discard_range: Failed to fallocate "
+                             "%s:%" PRIx64 " +%zx (%d)",
+                             rb->idstr, start, length, ret);
+                goto err;
+            }
+#else
+            ret = -ENOSYS;
+            error_report("ram_block_discard_range: fallocate not available/file"
+                         "%s:%" PRIx64 " +%zx (%d)",
+                         rb->idstr, start, length, ret);
+            goto err;
 #endif
         }
-        if (ret) {
-            ret = -errno;
-            error_report("ram_block_discard_range: Failed to discard range "
+        if (need_madvise) {
+            /* For normal RAM this causes it to be unmapped,
+             * for shared memory it causes the local mapping to disappear
+             * and to fall back on the file contents (which we just
+             * fallocate'd away).
+             */
+#if defined(CONFIG_MADVISE)
+            ret =  madvise(host_startaddr, length, MADV_DONTNEED);
+            if (ret) {
+                ret = -errno;
+                error_report("ram_block_discard_range: Failed to discard range "
+                             "%s:%" PRIx64 " +%zx (%d)",
+                             rb->idstr, start, length, ret);
+                goto err;
+            }
+#else
+            ret = -ENOSYS;
+            error_report("ram_block_discard_range: MADVISE not available"
                          "%s:%" PRIx64 " +%zx (%d)",
                          rb->idstr, start, length, ret);
+            goto err;
+#endif
         }
+        trace_ram_block_discard_range(rb->idstr, host_startaddr,
+                                      need_madvise, need_fallocate, ret);
     } else {
         error_report("ram_block_discard_range: Overrun block '%s' (%" PRIu64
                      "/%zx/" RAM_ADDR_FMT")",
diff --git a/trace-events b/trace-events
index ec95e67089..bf9741d930 100644
--- a/trace-events
+++ b/trace-events
@@ -55,9 +55,10 @@ dma_complete(void *dbs, int ret, void *cb) "dbs=%p ret=%d cb=%p"
 dma_blk_cb(void *dbs, int ret) "dbs=%p ret=%d"
 dma_map_wait(void *dbs) "dbs=%p"
 
-#  # exec.c
+# exec.c
 find_ram_offset(uint64_t size, uint64_t offset) "size: 0x%" PRIx64 " @ 0x%" PRIx64
 find_ram_offset_loop(uint64_t size, uint64_t candidate, uint64_t offset, uint64_t next, uint64_t mingap) "trying size: 0x%" PRIx64 " @ 0x%" PRIx64 ", offset: 0x%" PRIx64" next: 0x%" PRIx64 " mingap: 0x%" PRIx64
+ram_block_discard_range(const char *rbname, void *hva, bool need_madvise, bool need_fallocate, int ret) "%s@%p: madvise: %d fallocate: %d ret: %d"
 
 # memory.c
 memory_region_ops_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
-- 
2.14.3

  reply	other threads:[~2018-02-16 13:16 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-16 13:15 [Qemu-devel] [PATCH v3 00/29] postcopy+vhost-user/shared ram Dr. David Alan Gilbert (git)
2018-02-16 13:15 ` Dr. David Alan Gilbert (git) [this message]
2018-02-28  6:37   ` [Qemu-devel] [PATCH v3 01/29] migrate: Update ram_block_discard_range for shared Peter Xu
2018-02-28 19:54     ` Dr. David Alan Gilbert
2018-02-16 13:15 ` [Qemu-devel] [PATCH v3 02/29] qemu_ram_block_host_offset Dr. David Alan Gilbert (git)
2018-02-16 13:15 ` [Qemu-devel] [PATCH v3 03/29] postcopy: use UFFDIO_ZEROPAGE only when available Dr. David Alan Gilbert (git)
2018-02-28  6:53   ` Peter Xu
2018-03-05 17:23     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 04/29] postcopy: Add notifier chain Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 05/29] postcopy: Add vhost-user flag for postcopy and check it Dr. David Alan Gilbert (git)
2018-02-28  7:14   ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 06/29] vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 07/29] libvhost-user: Support sending fds back to qemu Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 08/29] libvhost-user: Open userfaultfd Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 09/29] postcopy: Allow registering of fd handler Dr. David Alan Gilbert (git)
2018-02-28  8:38   ` Peter Xu
2018-03-05 17:35     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 10/29] vhost+postcopy: Register shared ufd with postcopy Dr. David Alan Gilbert (git)
2018-02-28  8:46   ` Peter Xu
2018-03-05 18:21     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 11/29] vhost+postcopy: Transmit 'listen' to client Dr. David Alan Gilbert (git)
2018-02-28  8:42   ` Peter Xu
2018-03-05 17:42     ` Dr. David Alan Gilbert
2018-03-06  7:06       ` Peter Xu
2018-03-06 11:20         ` Dr. David Alan Gilbert
2018-03-07 10:05           ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 12/29] postcopy+vhost-user: Split set_mem_table for postcopy Dr. David Alan Gilbert (git)
2018-02-28  8:49   ` Peter Xu
2018-03-05 18:45     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 13/29] migration/ram: ramblock_recv_bitmap_test_byte_offset Dr. David Alan Gilbert (git)
2018-02-28  8:52   ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 14/29] libvhost-user+postcopy: Register new regions with the ufd Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 15/29] vhost+postcopy: Send address back to qemu Dr. David Alan Gilbert (git)
2018-02-27 14:25   ` Michael S. Tsirkin
2018-02-27 19:54     ` Dr. David Alan Gilbert
2018-02-27 20:25       ` Michael S. Tsirkin
2018-02-28 18:26         ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 16/29] vhost+postcopy: Stash RAMBlock and offset Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 17/29] vhost+postcopy: Send requests to source for shared pages Dr. David Alan Gilbert (git)
2018-02-28 10:03   ` Peter Xu
2018-03-05 18:55     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 18/29] vhost+postcopy: Resolve client address Dr. David Alan Gilbert (git)
2018-03-02  7:29   ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 19/29] postcopy: wake shared Dr. David Alan Gilbert (git)
2018-03-02  7:44   ` Peter Xu
2018-03-05 19:35     ` Dr. David Alan Gilbert
2018-03-12 15:44   ` Marc-André Lureau
2018-03-12 16:42     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 20/29] postcopy: postcopy_notify_shared_wake Dr. David Alan Gilbert (git)
2018-03-02  7:51   ` Peter Xu
2018-03-05 19:55     ` Dr. David Alan Gilbert
2018-03-06  3:37       ` Peter Xu
2018-03-06 10:54         ` Dr. David Alan Gilbert
2018-03-07 10:13           ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 21/29] vhost+postcopy: Add vhost waker Dr. David Alan Gilbert (git)
2018-03-02  7:55   ` Peter Xu
2018-03-05 20:16     ` Dr. David Alan Gilbert
2018-03-06  7:19       ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 22/29] vhost+postcopy: Call wakeups Dr. David Alan Gilbert (git)
2018-03-02  8:05   ` Peter Xu
2018-03-06 10:36     ` Dr. David Alan Gilbert
2018-03-08  6:22       ` Peter Xu
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 23/29] libvhost-user: mprotect & madvises for postcopy Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 24/29] vhost-user: Add VHOST_USER_POSTCOPY_END message Dr. David Alan Gilbert (git)
2018-02-26 20:27   ` Michael S. Tsirkin
2018-02-27 10:09     ` Dr. David Alan Gilbert
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 25/29] vhost+postcopy: Wire up POSTCOPY_END notify Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 26/29] vhost: Huge page align and merge Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 27/29] postcopy: Allow shared memory Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 28/29] libvhost-user: Claim support for postcopy Dr. David Alan Gilbert (git)
2018-02-16 13:16 ` [Qemu-devel] [PATCH v3 29/29] postcopy shared docs Dr. David Alan Gilbert (git)
2018-02-27 14:01 ` [Qemu-devel] [PATCH v3 00/29] postcopy+vhost-user/shared ram Michael S. Tsirkin
2018-02-27 20:05   ` Dr. David Alan Gilbert
2018-02-27 20:23     ` Michael S. Tsirkin
2018-02-28 18:38       ` Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180216131625.9639-2-dgilbert@redhat.com \
    --to=dgilbert@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mst@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).