qemu-devel.nongnu.org archive mirror
From: Yuan Liu <yuan1.liu@intel.com>
To: peterx@redhat.com, farosas@suse.de
Cc: qemu-devel@nongnu.org, hao.xiang@bytedance.com,
	bryan.zhang@bytedance.com, yuan1.liu@intel.com,
	nanhai.zou@intel.com
Subject: [PATCH 0/1] Solve zero page causing multiple page faults
Date: Mon,  1 Apr 2024 23:41:09 +0800
Message-ID: <20240401154110.2028453-1-yuan1.liu@intel.com>

1. Description of multiple page faults for received zero pages
    a. The -mem-prealloc feature and hugepage backend are not enabled on
       the destination.
    b. After receiving a zero page, the destination first checks whether
       the current page content is 0 via buffer_is_zero, which may cause
       a read page fault.

      Output of perf record -e page-faults:
      13.75%  13.75%  multifdrecv_0 qemu-system-x86_64 [.] buffer_zero_avx512
      11.85%  11.85%  multifdrecv_1 qemu-system-x86_64 [.] buffer_zero_avx512
                      multifd_recv_thread
                      nocomp_recv
                      multifd_recv_zero_page_process
                      buffer_is_zero
                      select_accel_fn 
                      buffer_zero_avx512

    c. Other page faults mainly come from write operations to normal and
       zero pages.
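
The read and write faults described above can be observed outside QEMU
with a small standalone program. The following is only a sketch, assuming
Linux, an anonymous private mapping without preallocation, and the
ru_minflt counter from getrusage(2) as the fault metric. The helper names
minor_faults and count_touch_faults are hypothetical and are not part of
QEMU.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: minor page faults taken by this process so far. */
static long minor_faults(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);

    assert(rc == 0);
    (void)rc;
    return ru.ru_minflt;
}

/*
 * Hypothetical helper: map npages of fresh anonymous memory, touch one
 * byte per page (a read if do_write is 0, a write otherwise), and return
 * how many minor faults the touches caused. Returns -1 if mmap fails.
 */
long count_touch_faults(size_t npages, int do_write)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * psz;
    volatile uint8_t *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    long before, after;

    if (p == MAP_FAILED) {
        return -1;
    }

    before = minor_faults();
    for (size_t i = 0; i < npages; i++) {
        if (do_write) {
            p[i * psz] = 0;      /* first write to an untouched page */
        } else {
            (void)p[i * psz];    /* first read, as a zero check would do */
        }
    }
    after = minor_faults();

    munmap((void *)p, len);
    return after - before;
}
```

Calling count_touch_faults(n, 0) and count_touch_faults(n, 1) and
comparing the results shows how many faults reads and writes to untouched
pages cost on a given host. The exact counts depend on kernel
configuration such as transparent hugepages.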

2. Solution
    a. During multifd migration, received pages are tracked through the
       RAMBlock's receivedmap.

    b. If a received zero page is not yet set in receivedmap, the
       destination does not check whether the page content is 0, which
       avoids the read fault.

    c. If the zero page is already set in receivedmap, the page is set to
       0 directly.

    There are two reasons for this:
    1. A page that has already been sent once or more is unlikely to be a
       zero page, so checking its content first rarely helps.
    2. The first time the destination receives a zero page, the page must
       already be zero, so there is no need to scan it in the first round.
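
The receive-side logic in steps a to c can be sketched as below. This is
an illustration only, not the patch itself: SketchBlock,
SKETCH_PAGE_SIZE, and the sketch_* functions are hypothetical stand-ins
for RAMBlock, the target page size, and the real bitmap helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for a RAMBlock: page memory plus a received map. */
typedef struct {
    uint8_t *host;              /* start of the block's memory */
    size_t npages;              /* number of pages in the block */
    unsigned long *receivedmap; /* one bit per page, initially all clear */
} SketchBlock;

static bool sketch_test_received(const SketchBlock *b, size_t page)
{
    assert(page < b->npages);
    return (b->receivedmap[page / SKETCH_BITS_PER_WORD] >>
            (page % SKETCH_BITS_PER_WORD)) & 1UL;
}

static void sketch_set_received(SketchBlock *b, size_t page)
{
    assert(page < b->npages);
    b->receivedmap[page / SKETCH_BITS_PER_WORD] |=
        1UL << (page % SKETCH_BITS_PER_WORD);
}

/*
 * Handle one received zero page. Returns true if the page memory was
 * written, false if it was left untouched.
 */
static bool sketch_recv_zero_page(SketchBlock *b, size_t page)
{
    if (sketch_test_received(b, page)) {
        /* Seen before: clear it directly, without reading it first. */
        memset(b->host + page * SKETCH_PAGE_SIZE, 0, SKETCH_PAGE_SIZE);
        return true;
    }
    /* First time: assume the untouched page is already 0, only mark it. */
    sketch_set_received(b, page);
    return false;
}

/* Handle one received normal page: copy the data and mark it received. */
static void sketch_recv_normal_page(SketchBlock *b, size_t page,
                                    const uint8_t *data)
{
    assert(page < b->npages);
    memcpy(b->host + page * SKETCH_PAGE_SIZE, data, SKETCH_PAGE_SIZE);
    sketch_set_received(b, page);
}
```

In this sketch neither path reads the destination page, so no read fault
can occur, and a zero page that arrives for a never-received page causes
no memory access at all.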

3. Test results: VM with 16 vCPUs and 64G memory, 2 multifd channels,
   and 100G network bandwidth

    3.1 Test case: 16 vCPUs are idle and only 2G of memory is used
    +-----------+--------+--------+----------+
    |MultiFD    | total  |downtime|   Page   |
    |Nocomp     | time   |        | Faults   |
    |           | (ms)   | (ms)   |          |
    +-----------+--------+--------+----------+
    |with       |        |        |          |
    |recvbitmap |    7335|     180|      2716|
    +-----------+--------+--------+----------+
    |without    |        |        |          |
    |recvbitmap |    7771|     153|    121357|
    +-----------+--------+--------+----------+
    +-----------+--------+--------+--------+-------+--------+-------------+
    |MultiFD    | total  |downtime| SVM    |SVM    | IOTLB  | IO PageFault|
    |QPL        | time   |        | IO TLB |IO Page| MaxTime| MaxTime     |
    |           | (ms)   | (ms)   | Flush  |Faults | (us)   | (us)        |
    +-----------+--------+--------+--------+-------+--------+-------------+
    |with       |        |        |        |       |        |             |
    |recvbitmap |   10224|     175|     410|  27429|       1|          447|
    +-----------+--------+--------+--------+-------+--------+-------------+
    |without    |        |        |        |       |        |             |
    |recvbitmap |   11253|     153|   80756|  38655|      25|        18349|
    +-----------+--------+--------+--------+-------+--------+-------------+


    3.2 Test case: 16 vCPUs are idle and 56G of memory (non-zero) is used
    +-----------+--------+--------+----------+
    |MultiFD    | total  |downtime|   Page   |
    |Nocomp     | time   |        | Faults   |
    |           | (ms)   | (ms)   |          |
    +-----------+--------+--------+----------+
    |with       |        |        |          |
    |recvbitmap |   16825|     165|     52967|
    +-----------+--------+--------+----------+
    |without    |        |        |          |
    |recvbitmap |   12987|     159|   2672677|
    +-----------+--------+--------+----------+

    +-----------+--------+--------+--------+-------+--------+-------------+
    |MultiFD    | total  |downtime| SVM    |SVM    | IOTLB  | IO PageFault|
    |QPL        | time   |        | IO TLB |IO Page| MaxTime| MaxTime     |
    |           | (ms)   | (ms)   | Flush  |Faults | (us)   | (us)        |
    +-----------+--------+--------+--------+-------+--------+-------------+
    |with       |        |        |        |       |        |             |
    |recvbitmap |  132315|      77|     890| 937105|      60|         9581|
    +-----------+--------+--------+--------+-------+--------+-------------+
    |without    |        |        |        |       |        |             |
    |recvbitmap | >138333|     N/A| 1647701| 981899|      43|        21018|
    +-----------+--------+--------+--------+-------+--------+-------------+


The test results show that both page faults and IOTLB flush operations
are significantly reduced. Zero page processing no longer triggers read
faults, and a large number of zero pages do not even trigger write faults
(Test 3.1), because pages that have not been accessed since the
destination started are assumed to contain 0.

I have one concern here: the RAM is allocated by mmap with the anonymous
flag. If the first received zero page is not explicitly set to 0, is the
memory of the received zero pages guaranteed to be 0?
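
One way to probe this concern is a standalone check. On Linux, mmap(2)
documents that the contents of a MAP_ANONYMOUS mapping are initialized to
zero. The sketch below only confirms that for a fresh, untouched mapping.
It says nothing about other memory backends or about pages that something
on the destination wrote before migration started. The function name
fresh_anon_mapping_is_zero is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Map len bytes of anonymous private memory and scan it.
 * Returns 1 if every byte is 0, 0 if a non-zero byte is found,
 * and -1 if mmap fails.
 */
int fresh_anon_mapping_is_zero(size_t len)
{
    uint8_t *p;
    int all_zero = 1;

    assert(len > 0); /* mmap rejects zero-length mappings */

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }

    munmap(p, len);
    return all_zero;
}
```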

In this case, the impact on live migration performance is small because
the destination is not the bottleneck.

When using QPL (an SVM-capable device), even though IOTLB flushes are
reduced, the overall performance is still seriously degraded because a
large number of IO page faults are still generated.

Previous discussion links:
1. https://lore.kernel.org/all/CAAYibXib+TWnJpV22E=adncdBmwXJRqgRjJXK7X71J=bDfaxDg@mail.gmail.com/
2. https://lore.kernel.org/all/PH7PR11MB594123F7EEFEBFCE219AF100A33A2@PH7PR11MB5941.namprd11.prod.outlook.com/

Yuan Liu (1):
  migration/multifd: solve zero page causing multiple page faults

 migration/multifd-zero-page.c | 4 +++-
 migration/multifd-zlib.c      | 1 +
 migration/multifd-zstd.c      | 1 +
 migration/multifd.c           | 1 +
 migration/ram.c               | 4 ++++
 migration/ram.h               | 1 +
 6 files changed, 11 insertions(+), 1 deletion(-)

-- 
2.39.3


