From: Gary R Hook <grhookatwork@gmail.com>
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] Fwd: Re: Tunneled Migration with Non-Shared Storage
Date: Wed, 19 Nov 2014 14:12:24 -0600
Message-ID: <546CF9A8.6070104@gmail.com>
In-Reply-To: <546CE8EC.9090908@gmail.com>
Ugh, I wish I could teach Thunderbird to understand how to reply to a
newsgroup.
Apologies to Paolo for the direct note.
On 11/19/14 4:19 AM, Paolo Bonzini wrote:
>
>
> On 19/11/2014 10:35, Dr. David Alan Gilbert wrote:
>> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>>>
>>>
>>> On 18/11/2014 21:28, Dr. David Alan Gilbert wrote:
>>>> This seems odd, since as far as I know the tunneling code is quite separate
>>>> to the migration code; I thought the only thing that the migration
>>>> code sees different is the file descriptors it gets passed.
>>>> (Having said that, again I don't know storage stuff, so if this
>>>> is a storage special there may be something there...)
>>>
>>> Tunnelled migration uses the old block-migration.c code. Non-tunnelled
>>> migration uses the NBD server and block/mirror.c.
>>
>> OK, that explains that. Is that because the tunneling code can't
>> deal with tunneling the NBD server connection?
>>
>>> The main problem with
>>> the old code is that it uses a possibly unbounded amount of memory in
>>> mig_save_device_dirty and can have huge jitter if any serious workload
>>> is running in the guest.
>>
>> So that's sending dirty blocks iteratively? Not that I can see
>> when the allocations get freed; but is the amount allocated there
>> related to total disk size (as Gary suggested) or to the amount
>> of dirty blocks?
>
> It should be related to the maximum rate limit (which can be set to
> arbitrarily high values, however).
This makes no sense. The code in block_save_iterate() specifically
attempts to control the rate of transfer. But when
qemu_file_get_rate_limit() returns a number like 922337203685372723
(0xCCCCCCCCCCB3333), I'm under the impression that no bandwidth
constraints are being imposed at this layer. Why, then, would that
transfer be occurring at 20 MB/sec (over a simple, under-utilized 1 GigE
connection) with no clear bottleneck in CPU or network? What other
relation might exist?
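For reference, here is the loop in question as I read it in
block-migration.c of this vintage (abridged and paraphrased, so treat
it as a sketch rather than the exact source):

  /* block_save_iterate(): "control the rate of transfer" */
  while ((block_mig_state.submitted + block_mig_state.read_done) *
             BLOCK_SIZE < qemu_file_get_rate_limit(f) &&
         (block_mig_state.submitted + block_mig_state.read_done) <
             MAX_INFLIGHT_IO) {
      if (!block_mig_state.bulk_completed) {
          /* first finish the bulk phase */
          if (blk_mig_save_bulked_block(f) == 0) {
              block_mig_state.bulk_completed = 1;
          }
      } else {
          ret = blk_mig_save_dirty_block(f, 1);
          /* ... error handling elided ... */
      }
  }

With the limit pinned at a value that high, the first clause never
fails, so this layer imposes no bandwidth constraint at all; whatever
is holding the transfer to 20 MB/sec lives elsewhere.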
> The reads are started, then the ones that are ready are sent and the
> blocks are freed in flush_blks. The jitter happens when the guest reads
> a lot but only writes a few blocks. In that case, the bdrv_drain_all in
> mig_save_device_dirty can be called relatively often and it can be
> expensive because it also waits for all guest-initiated reads to complete.
Pardon my ignorance, but this does not match my observations. What I am
seeing is the process size of the source qemu grow steadily until the
COR completes; during this time the backing file on the destination
system does not change/grow at all, which implies that no blocks are
being transferred. (I have tested this with a 25GB VM disk, and larger;
no network activity occurs during this period.) Once the COR is done and
the in-memory copy is ready (marked by a "Completed 100%" message from
blk_mig_save_bulked_block()) the transfer begins. At an abysmally slow
rate, I'll add, per the above. Another problem to be investigated.
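For anyone following along, the allocation pattern during that bulk
phase looks like this (paraphrasing mig_save_device_bulk() and its
completion path; again an abridged sketch, not the exact source):

  /* mig_save_device_bulk(): one buffer per 1 MiB chunk */
  blk = g_new(BlkMigBlock, 1);
  blk->buf = g_malloc(BLOCK_SIZE);          /* BLOCK_SIZE = 1 << 20 */
  /* ... blk->qiov is set up over blk->buf ... */
  blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
                              nr_sectors, blk_mig_read_cb, blk);
  /* blk_mig_read_cb() moves the block to the read_done list; the
   * buffer is freed only after flush_blks() writes it to the
   * migration stream. */

If local reads complete faster than the stream drains, and the rate
limit never pushes back, completed buffers simply accumulate, which
would account for the growth I'm seeing.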
> The bulk phase is similar, just with different functions (the reads are
> done in mig_save_device_bulk). With a high rate limit, the total
> allocated memory can reach a few gigabytes indeed.
Much, much more than that. It's definitely dependent upon the disk file
size. Tiny VM disks are a nit; big VM disks are a problem.
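Back-of-the-envelope, assuming the 1 MiB chunk size above:

  25 GB disk / 1 MiB per chunk        = ~25,000 BlkMigBlock buffers
  ~25,000 x (1 MiB + struct overhead) = ~25 GB resident, worst case

That is, if the whole disk can be read before the stream drains the
queue, peak memory tracks the disk size, which matches what I observe.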
> Depending on the scenario, a possible disadvantage of NBD migration is
> that it can only throttle each disk separately, while the old code will
> apply a single limit to all migrations.
How about no throttling at all? And just to be very clear, the goal is
fast (NBD-based) migration of VMs using non-shared storage over an
encrypted channel: the safest approach to the worst-case scenario. That,
and gaining an understanding of this code.
Thank you for your attention.
--
Gary R Hook
Senior Kernel Engineer
NIMBOXX, Inc