From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
Cc: "mrhines@us.ibm.com" <mrhines@us.ibm.com>,
Bulent Abali <abali@us.ibm.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Anthony Liguori <anthony@codemonkey.ws>,
Juan Jose Quintela Carreira <quintela@redhat.com>
Subject: Re: [Qemu-devel] RDMA: please pull and re-test freezing fixes
Date: Sun, 16 Jun 2013 00:13:46 -0400 [thread overview]
Message-ID: <51BD3B7A.4050504@linux.vnet.ibm.com> (raw)
In-Reply-To: <51BB7F62.9000304@linux.vnet.ibm.com>
[-- Attachment #1: Type: text/plain, Size: 4734 bytes --]
These are great results. Even the one that doesn't converge is good,
because it shows that the CPU throttling patch you posted earlier on the
mailing list is still very necessary.
Thanks a lot. I will send out a V10 patch incorporating everybody's results.
- Michael
Would you be interested in me posting them to the wiki?
On 6/14/2013 1:38 PM, Michael R. Hines wrote:
> Chegu,
>
> I sent a V9 to the mailing list:
>
> This version goes even further, by explicitly timing the pinning latency
> and pushing the value out to QMP, so the user clearly knows which
> component of total migration time is consumed by pinning.
>
> If you're satisfied, I'd appreciate if I could add your Reviewed-By: =)
>
Please see below... and yes, you can add me.
Thanks,
Vinod
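For context, the pin-all timing Michael mentions above shows up as the
"pin-all" line in the 'info migrate' outputs below; over QMP it would come
back from query-migrate roughly like this (a sketch only; the exact field
name, shown here as "pin-all-time", is hypothetical, and the numbers are
taken from the 64GB pinned run):

  -> { "execute": "query-migrate" }
  <- { "return": { "status": "completed",
                   "total-time": 47451,
                   "downtime": 2639,
                   "pin-all-time": 22780 } }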
The migration speed was set to 40G and the downtime to 2 sec for all
experiments below.
Note: Idle guests are not interesting due to tons of zero pages etc., but
I am including them here to highlight the overhead of pinning.
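For anyone reproducing these runs, the settings above correspond to monitor
commands along these lines (a sketch; the destination host and port are
placeholders, the x-rdma: URI and x-rdma-pin-all capability names are those
used by this patch series, and the capability line is issued only for the
"Pinned" runs):

  (qemu) migrate_set_speed 40g
  (qemu) migrate_set_downtime 2
  (qemu) migrate_set_capability x-rdma-pin-all on
  (qemu) migrate -d x-rdma:<dest-host>:<port>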
1) 20vcpu/64GB guest (a larger-sized Cloud-type guest):
a) Idle guest with no pinning (default):
capabilities: xbzrle: off x-rdma-pin-all: off
Migration status: completed
total time: 51062 milliseconds
downtime: 1948 milliseconds
pin-all: 0 milliseconds
transferred ram: 1816547 kbytes
throughput: 6872.23 mbps
remaining ram: 0 kbytes
total ram: 67117632 kbytes
duplicate: 16331552 pages
skipped: 0 pages
normal: 450038 pages
normal bytes: 1800152 kbytes
b) Idle guest with pinning:
capabilities: xbzrle: off x-rdma-pin-all: on
Migration status: completed
total time: 47451 milliseconds
downtime: 2639 milliseconds
pin-all: 22780 milliseconds
transferred ram: 67136643 kbytes
throughput: 25222.91 mbps
remaining ram: 0 kbytes
total ram: 67117632 kbytes
duplicate: 0 pages
skipped: 0 pages
normal: 16780064 pages
normal bytes: 67120256 kbytes
There were no freezes observed in the guest at the start of the migration,
but the qemu monitor prompt was not responsive for the duration of the
memory pinning.
Total migration time was affected by the cost of pinning at the start of the
migration, as shown above. (This issue can be pursued and optimized later.)
c) Pinning + guest running a Java warehouse workload (I cranked the workload
up to keep the guest 95+% busy):
capabilities: xbzrle: off x-rdma-pin-all: on
Migration status: active
total time: 412706 milliseconds
expected downtime: 499 milliseconds
pin-all: 22758 milliseconds
transferred ram: 657243669 kbytes
throughput: 25241.89 mbps
remaining ram: 7281848 kbytes
total ram: 67117632 kbytes
duplicate: 0 pages
skipped: 0 pages
normal: 164270810 pages
normal bytes: 657083240 kbytes
dirty pages rate: 369925 pages
No convergence! (For workloads where the memory dirty rate is very high,
there are other alternatives that have been discussed in the past; one is
sketched below.)
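One such alternative is the CPU throttling work mentioned at the top of this
mail: slow the vcpus down when the dirty rate outpaces the transfer rate, so
the remaining working set can drain. Assuming it lands as a migration
capability named auto-converge (as in Chegu's series), enabling it would look
like:

  (qemu) migrate_set_capability auto-converge on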
---
2) Enterprise-type guests tend to get fatter (more memory per CPU) than the
larger Cloud guests... so here are a couple of them.
a) 20VCPU/256GB idle guest:
Default:
capabilities: xbzrle: off x-rdma-pin-all: off
Migration status: completed
total time: 259259 milliseconds
downtime: 3924 milliseconds
pin-all: 0 milliseconds
transferred ram: 5522078 kbytes
throughput: 6586.06 mbps
remaining ram: 0 kbytes
total ram: 268444224 kbytes
duplicate: 65755168 pages
skipped: 0 pages
normal: 1364124 pages
normal bytes: 5456496 kbytes
Pinned:
capabilities: xbzrle: off x-rdma-pin-all: on
Migration status: completed
total time: 219053 milliseconds
downtime: 4277 milliseconds
pin-all: 118153 milliseconds
transferred ram: 268512809 kbytes
throughput: 22209.32 mbps
remaining ram: 0 kbytes
total ram: 268444224 kbytes
duplicate: 0 pages
skipped: 0 pages
normal: 67111817 pages
normal bytes: 268447268 kbytes
b) 40VCPU/512GB idle guest:
Default:
capabilities: xbzrle: off x-rdma-pin-all: off
Migration status: completed
total time: 670577 milliseconds
downtime: 6139 milliseconds
pin-all: 0 milliseconds
transferred ram: 10279256 kbytes
throughput: 6150.93 mbps
remaining ram: 0 kbytes
total ram: 536879680 kbytes
duplicate: 131704099 pages
skipped: 0 pages
normal: 2537017 pages
normal bytes: 10148068 kbytes
Pinned:
capabilities: xbzrle: off x-rdma-pin-all: on
Migration status: completed
total time: 527576 milliseconds
downtime: 6314 milliseconds
pin-all: 312984 milliseconds
transferred ram: 537129685 kbytes
throughput: 20177.27 mbps
remaining ram: 0 kbytes
total ram: 536879680 kbytes
duplicate: 0 pages
skipped: 0 pages
normal: 134249644 pages
normal bytes: 536998576 kbytes
No freezes in the guest due to memory pinning. (Freezes were only due to the
dirty bitmap sync-up, which is done while the BQL is held. Juan is already
working on addressing this for qemu 1.6.)
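One more observation from the pin-all numbers above: the pinning cost grows a
bit worse than linearly with guest memory size. 22780 ms / 64 GB is roughly
356 ms/GB, 118153 ms / 256 GB is roughly 462 ms/GB, and 312984 ms / 512 GB is
roughly 611 ms/GB. That is another reason to pursue the pinning optimization
mentioned earlier.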