Message-ID: <4974D34D.9070300@codemonkey.ws>
Date: Mon, 19 Jan 2009 13:23:57 -0600
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
In-Reply-To: <18804.45277.255939.338648@mariner.uk.xensource.com>
To: qemu-devel@nongnu.org

Ian Jackson wrote:
> Anthony Liguori writes ("Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API"):
>
>> The packet IO API is a bit different. It looks like:
>
> The purpose here is to be able to make only one system call to the
> host kernel in order to do an operation which involves a scatter
> gather list in guest physical memory (as provided by the guest to eg
> an emulated DMA controller) ?
>
> And the idea is to try to map as much of that contiguously as
> possible so that only one system call need be made ?

No, it's not an issue of performance, it's an issue of correctness. A
single readv/writev system call to a tap file descriptor corresponds
to a single packet. You cannot split a packet into multiple read/write
operations to tap, IIUC.

> Are there supposed to be fast-path devices where we absolutely must
> make the host system call for the whole transfer in one go, in one
> contiguous memory region ?

It's a matter of correctness. Packet protocols (datagram protocols if
you will) must preserve message boundaries.
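To make the idiom concrete, here is a minimal sketch of the map()-based
transmit path, assuming the cpu_physical_memory_map()/
cpu_physical_memory_unmap() entry points proposed in this series; the
sg_elem structure, tap_fd, and tx_packet() are purely illustrative and
not part of the patch:

#include <sys/uio.h>    /* writev() */

/* Illustrative guest scatter-gather element, as a device model might
 * collect it from a DMA descriptor ring. */
struct sg_elem {
    target_phys_addr_t addr;
    target_phys_addr_t len;
};

/* Transmit one packet with a single writev(): the whole scatter-gather
 * list is mapped up front, so the packet reaches the tap fd as one
 * datagram and the message boundary is preserved. */
static int tx_packet(int tap_fd, struct sg_elem *sg, int sg_cnt)
{
    struct iovec iov[sg_cnt];
    void *mapped[sg_cnt];
    target_phys_addr_t lens[sg_cnt];
    int i, ret = -1;

    for (i = 0; i < sg_cnt; i++) {
        target_phys_addr_t len = sg[i].len;
        void *p = cpu_physical_memory_map(sg[i].addr, &len, 0 /* is_write */);

        if (!p || len < sg[i].len) {
            /* Bounce space exhausted or the region was split: give up
             * on zero-copy and let the caller fall back to a copy. */
            if (p) {
                cpu_physical_memory_unmap(p, len, 0, 0);
            }
            goto unmap;
        }
        mapped[i] = p;
        lens[i] = len;
        iov[i].iov_base = p;
        iov[i].iov_len = len;
    }

    ret = writev(tap_fd, iov, sg_cnt);   /* one packet, one system call */

unmap:
    while (--i >= 0) {
        cpu_physical_memory_unmap(mapped[i], lens[i], 0, lens[i]);
    }
    return ret;
}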
>> So this is why I prefer the map() API, as it accommodates two distinct
>> users in a way that the callback API wouldn't. We can formalize these
>> idioms into an API, of course.
>
> I don't think there is any fundamental difference between a callback
> API and a polling API; you can implement whatever semantics you like
> with either.
>
> But callbacks are needed in at least some cases because of the way
> that the bounce buffer may need to be reserved/released. That means
> all of the callers have to deal with callbacks anyway.
>
> So it makes sense to make that the only code path. That way callers
> only need to be written once.

If you use callbacks, consider the following:

Scenario 1 (no callbacks)
1) Attempt to map all of the data in the packet
2) Failure occurs because the bounce buffer is too small to contain the packet
3) We can't do zero-copy IO here, so fall back to a copy into a
   device-supplied buffer (a rough sketch of this fallback path is
   appended at the end of this mail)
4) Transmit the full packet

Scenario 2 (callbacks)
1) Register a callback and begin mapping the data
2) Succeed in mapping the first part of the packet plus one page of
   bounce buffer
3) The bounce buffer is exhausted, so the callback will not be invoked
   until the next unmap
4) Deadlock

You could add some sort of call to determine whether the bounce memory
is exhausted and try to take corrective action in #2, but it gets ugly.

So as I said earlier, I think the callback model makes sense when you
have streaming data (where there are no packet boundaries). I think it
is pretty difficult to avoid deadlocking with callbacks in a packet
model (where boundaries must be preserved), though.

Regards,

Anthony Liguori

>> BTW, to support this model, we have to reserve at least one bounce
>> buffer for cpu_physical_memory_rw.
>
> Yes.
>
> Ian.
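A rough sketch of the Scenario 1 fallback path, under the same
illustrative assumptions as the earlier sketch (sg_elem, tap_fd, and
the helper name are invented for illustration); cpu_physical_memory_rw()
is the existing copying interface mentioned above:

#include <unistd.h>    /* write() */
#include <stdint.h>

/* Zero-copy mapping failed (bounce space exhausted), so copy the whole
 * packet into a device-supplied linear buffer and still hand it to the
 * tap fd as a single write(), preserving the packet boundary. */
static int tx_packet_copy(int tap_fd, struct sg_elem *sg, int sg_cnt,
                          uint8_t *buf, size_t buf_size)
{
    size_t off = 0;
    int i;

    for (i = 0; i < sg_cnt; i++) {
        if (off + sg[i].len > buf_size) {
            return -1;              /* packet larger than the buffer */
        }
        /* cpu_physical_memory_rw() always completes; it needs at most
         * the one reserved bounce buffer noted above. */
        cpu_physical_memory_rw(sg[i].addr, buf + off, sg[i].len, 0);
        off += sg[i].len;
    }

    return write(tap_fd, buf, off); /* one packet, one system call */
}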