Message-ID: <51486C0D.2040609@redhat.com>
Date: Tue, 19 Mar 2013 14:45:49 +0100
From: Paolo Bonzini
In-Reply-To: <51486AD0.80309@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v4: 08/10] introduce QEMUFileRDMA
To: "Michael R. Hines"
Cc: aliguori@us.ibm.com, mst@redhat.com, qemu-devel@nongnu.org, owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com

On 19/03/2013 14:40, Michael R. Hines wrote:
> On 03/19/2013 09:25 AM, Paolo Bonzini wrote:
>> Yes, the drain needs to happen in a few places already:
>>
>> 1. During save_rdma_page (if the current "chunk" is full of pages)
>>    Ok, this is internal to RDMA so no problem.
>>
>>> 2. During the end of each iteration (now using qemu_fflush in my
>>> current patch)
>>
>> Why?
>
> This is because of downtime: you have to drain the queue anyway at the
> very end, and if you don't drain it in advance after each iteration,
> the queue will have lots of bytes in it waiting for transmission, and
> the virtual machine will be stopped for a much longer period of time
> during the last iteration, waiting for the RDMA card to finish
> transmitting all of those bytes.

Shouldn't the "current chunk full" case take care of it too?  Of course,
if you disable chunking you have to add a different condition, perhaps
directly into save_rdma_page.

> If you wait until the last iteration to do this, then all of that
> waiting time gets counted as downtime, causing the VCPUs to be
> unnecessarily stopped.
>
>>> 3. And also during qemu_savevm_state_complete(), also using
>>> qemu_fflush.
>>
>> This would be caught by put_buffer, but (2) would not.
>
> I'm not sure this is good enough either - we don't want to flush the
> queue *frequently*, only when it's necessary for performance.  We do
> want the queue to have some meat to it, so the hardware can write
> bytes as fast as possible.
>
> If we flush inside put_buffer (which is called very frequently):

Is it called at any time during RAM migration?

> then we have no way to distinguish *where* put_buffer was called from
> (either from qemu_savevm_state_complete() or from a device-level
> function call that's using QEMUFile).

Can you make drain a no-op if there is nothing in flight?  Then every
call to put_buffer after the first should not have any overhead.

Paolo
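A minimal sketch of that "no-op drain" idea, assuming a hypothetical
RDMAContext with an nb_sent counter of outstanding work requests (the
names below are illustrative, not code from the patch series):

#include <errno.h>

/* Hypothetical, simplified context -- field names are illustrative only. */
typedef struct RDMAContext {
    int nb_sent;   /* RDMA writes posted but not yet completed */
} RDMAContext;

/* Reap one completion; stubbed here, a real version would wrap ibv_poll_cq(). */
static int qemu_rdma_poll_one(RDMAContext *rdma)
{
    rdma->nb_sent--;
    return 0;
}

/*
 * Drain returns immediately when nothing is in flight, so calling it
 * from put_buffer on every invocation costs only an integer test after
 * the first flush has completed.
 */
static int qemu_rdma_drain(RDMAContext *rdma)
{
    while (rdma->nb_sent > 0) {
        if (qemu_rdma_poll_one(rdma) < 0) {
            return -EIO;
        }
    }
    return 0;   /* no-op when nb_sent was already zero */
}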
>>>> Yes, I am suggesting to modify f->pos in save_rdma_page instead.
>>>>
>>>> Paolo
>>>
>>> Would that not confuse the other QEMUFile users?
>>> If I change that pointer (without actually putting bytes
>>> into QEMUFile), won't the f->pos pointer be
>>> incorrectly updated?
>>
>> f->pos is never used directly by QEMUFile; it is almost an opaque
>> value.  It is accumulated on every qemu_fflush (so that it can be
>> passed to the ->put_buffer function), and returned by qemu_ftell;
>> nothing else.
>>
>> If you somehow make save_rdma_page a new op, returning a value from
>> that op and adding it to f->pos would be a good way to achieve this.
>
> Ok, great - I'll take advantage of that.  Thanks.
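To make the f->pos suggestion above concrete, here is a rough sketch
using simplified, hypothetical QEMUFile/QEMUFileOps definitions and a
hypothetical save_page hook (none of this is the actual QEMU code):

#include <stdint.h>
#include <stddef.h>

/* Simplified, hypothetical definitions -- not the real QEMUFile structures. */
typedef struct QEMUFile QEMUFile;

typedef struct QEMUFileOps {
    /* existing hooks (put_buffer, close, ...) omitted */
    size_t (*save_page)(QEMUFile *f, void *opaque, uint64_t block_offset,
                        uint64_t offset, size_t size);   /* hypothetical */
} QEMUFileOps;

struct QEMUFile {
    const QEMUFileOps *ops;
    void *opaque;
    int64_t pos;   /* opaque byte count: only accumulated and read back */
};

/*
 * Hypothetical wrapper: if the transport provides the op, let it move
 * the page out of band and credit whatever byte count it reports to
 * f->pos, so qemu_ftell() (which just returns f->pos) still reflects
 * the amount of migrated data.
 */
static size_t qemu_file_save_page(QEMUFile *f, uint64_t block_offset,
                                  uint64_t offset, size_t size)
{
    size_t sent;

    if (!f->ops->save_page) {
        return 0;   /* caller falls back to the normal put_buffer path */
    }

    sent = f->ops->save_page(f, f->opaque, block_offset, offset, size);
    f->pos += sent;   /* accumulate, as suggested in the thread */
    return sent;
}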