Message-ID: <546C6E9B.5010801@redhat.com>
Date: Wed, 19 Nov 2014 11:19:07 +0100
From: Paolo Bonzini <pbonzini@redhat.com>
To: "Dr. David Alan Gilbert"
Cc: Gary R Hook, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Tunneled Migration with Non-Shared Storage
In-Reply-To: <20141119093516.GA2355@work-vm>
References: <546B781B.3070309@gmail.com> <546B791B.6040908@gmail.com>
 <20141118202805.GC29868@work-vm> <546BBC54.5050900@redhat.com>

On 19/11/2014 10:35, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>>
>>
>> On 18/11/2014 21:28, Dr. David Alan Gilbert wrote:
>>> This seems odd, since as far as I know the tunneling code is quite
>>> separate from the migration code; I thought the only thing that the
>>> migration code sees differently is the file descriptors it gets passed.
>>> (Having said that, again I don't know storage stuff, so if this
>>> is a storage special case there may be something there...)
>>
>> Tunnelled migration uses the old block-migration.c code.  Non-tunnelled
>> migration uses the NBD server and block/mirror.c.
>
> OK, that explains that.  Is that because the tunneling code can't
> deal with tunneling the NBD server connection?
>
>> The main problem with the old code is that it uses a possibly unbounded
>> amount of memory in mig_save_device_dirty and can have huge jitter if
>> any serious workload is running in the guest.
>
> So that's sending dirty blocks iteratively?  Not that I can see
> when the allocations get freed; but is the amount allocated there
> related to total disk size (as Gary suggested) or to the amount
> of dirty blocks?

It should be related to the maximum rate limit (which can, however, be
set to arbitrarily high values).  The reads are started, then the ones
that are ready are sent and their blocks are freed in flush_blks.

The jitter happens when the guest reads a lot but only writes a few
blocks.  In that case, the bdrv_drain_all in mig_save_device_dirty can be
called relatively often, and it can be expensive because it also waits
for all guest-initiated reads to complete.

The bulk phase is similar, just with different functions (the reads are
done in mig_save_device_bulk).  With a high rate limit, the total
allocated memory can indeed reach a few gigabytes.

Depending on the scenario, a possible disadvantage of NBD migration is
that it can only throttle each disk separately, while the old code
applies a single limit to the migration as a whole.

Paolo
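
To make the memory behaviour described above more concrete, here is a
minimal, self-contained C sketch of the pattern: reads for dirty blocks
are started into freshly allocated buffers, queued, and only freed once a
flush_blks-style pass sends the completed ones.  The names BlkMigBlock,
save_dirty_block and flush_ready_blocks are illustrative and only loosely
modelled on block-migration.c; this is not the actual QEMU code.

/*
 * Minimal sketch (not the actual block-migration.c code) of the pattern
 * described above: dirty blocks are read into freshly allocated buffers,
 * queued, and only freed once a flush-style pass sends the completed
 * ones.  The amount of memory in flight therefore tracks how many reads
 * may start before a flush, which in QEMU scales with the rate limit.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE (1024 * 1024)        /* 1 MiB per migrated block */

typedef struct BlkMigBlock {            /* illustrative name only */
    uint8_t *buf;                       /* data read from the disk */
    int ready;                          /* set when the read completes */
    struct BlkMigBlock *next;
} BlkMigBlock;

static BlkMigBlock *blk_list;           /* blocks waiting to be sent */
static size_t bytes_in_flight;          /* memory currently held */

/* Start a (here: fake, synchronous) read of one dirty block. */
static void save_dirty_block(void)
{
    BlkMigBlock *blk = malloc(sizeof(*blk));

    blk->buf = malloc(BLOCK_SIZE);
    memset(blk->buf, 0, BLOCK_SIZE);    /* stands in for the disk read */
    blk->ready = 1;
    blk->next = blk_list;
    blk_list = blk;
    bytes_in_flight += BLOCK_SIZE;
}

/* Send every completed block and release its buffer. */
static void flush_ready_blocks(void)
{
    while (blk_list && blk_list->ready) {
        BlkMigBlock *blk = blk_list;

        blk_list = blk->next;
        /* ... blk->buf would be written to the migration stream here ... */
        bytes_in_flight -= BLOCK_SIZE;
        free(blk->buf);
        free(blk);
    }
}

int main(void)
{
    /* With a generous "rate limit", many reads pile up before a flush. */
    int blocks_before_flush = 256;      /* 256 MiB in flight in this toy */

    for (int i = 0; i < blocks_before_flush; i++) {
        save_dirty_block();
    }
    printf("in flight before flush: %zu MiB\n", bytes_in_flight >> 20);

    flush_ready_blocks();
    printf("in flight after flush:  %zu MiB\n", bytes_in_flight >> 20);
    return 0;
}

Compiling and running the sketch just prints the amount of buffered data
before and after the flush; the point is that in the real code the number
of reads allowed before a flush grows with the migration rate limit, which
is why a very high limit can push the allocated memory into the gigabytes.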