From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1NA6ht-0001As-12
	for qemu-devel@nongnu.org; Mon, 16 Nov 2009 13:53:45 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1NA6hn-00014e-Jy
	for qemu-devel@nongnu.org; Mon, 16 Nov 2009 13:53:43 -0500
Received: from [199.232.76.173] (port=35784 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1NA6hn-00014R-FF
	for qemu-devel@nongnu.org; Mon, 16 Nov 2009 13:53:39 -0500
Received: from mx20.gnu.org ([199.232.41.8]:64501)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <paul@codesourcery.com>)
	id 1NA6hn-0003qx-3k
	for qemu-devel@nongnu.org; Mon, 16 Nov 2009 13:53:39 -0500
Received: from mail.codesourcery.com ([38.113.113.100])
	by mx20.gnu.org with esmtp (Exim 4.60)
	(envelope-from <paul@codesourcery.com>) id 1NA6hl-0005fr-Qp
	for qemu-devel@nongnu.org; Mon, 16 Nov 2009 13:53:38 -0500
From: Paul Brook <paul@codesourcery.com>
Subject: Re: [Qemu-devel] [sneak preview] major scsi overhaul
Date: Mon, 16 Nov 2009 18:53:34 +0000
References: <4AF4ACA5.2090701@redhat.com>
	<200911111638.31288.paul@codesourcery.com>
	<4B017F46.4030700@redhat.com>
In-Reply-To: <4B017F46.4030700@redhat.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <200911161853.34668.paul@codesourcery.com>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Gerd Hoffmann <kraxel@redhat.com>
Cc: qemu-devel@nongnu.org

On Monday 16 November 2009, Gerd Hoffmann wrote:
> On 11/11/09 17:38, Paul Brook wrote:
> >>> That cap is important.
> >>> For scsi-generic you probably don't have a choice because of the way
> >>> the kernel interface works.
> >>
> >> Exactly.  And why is the cap important for scsi-disk if scsi-generic
> >> does fine without?
> >
> > With scsi-generic you're at the mercy of what the kernel API gives you,
> > and if the guest hardware/OS isn't cooperative then you lose.
> 
> The guest will lose with unreasonably large requests.  qemu_malloc() ->
> oom() -> abort() -> guest is gone.

Exactly. This lossage is not acceptable.

> We can also limit the amount of host memory we allow the guest to
> consume, so uncooperative guests can't push the host into swap.  This is
> not implemented today, indicating that it hasn't been a problem so far.

Capping the amount of memory required for a transfer *is* implemented, in both
LSI and virtio-blk.  The exception is SCSI passthrough, where the kernel API
makes it impossible.
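
A minimal sketch of what capping a transfer means in practice, assuming a
hypothetical XFER_CAP_BYTES limit and a hypothetical capped_transfer() helper
(neither is taken from the LSI or virtio-blk code). The two memcpy() calls
stand in for the device read and the guest DMA; the point is only that the
host buffer size is bounded by the cap, not by the size the guest asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on host memory used per request; not a QEMU constant. */
#define XFER_CAP_BYTES (64 * 1024)

/*
 * Move len bytes from src to dst through a bounce buffer that is never
 * larger than XFER_CAP_BYTES.  Returns the number of chunks used, or -1
 * if even the bounded allocation fails (the caller can then fail the
 * request instead of aborting).
 */
static long capped_transfer(unsigned char *dst, const unsigned char *src,
                            size_t len)
{
    size_t buf_len = len < XFER_CAP_BYTES ? len : XFER_CAP_BYTES;
    unsigned char *bounce;
    long chunks = 0;

    if (len == 0) {
        return 0;
    }
    assert(dst != NULL && src != NULL);

    bounce = malloc(buf_len);
    if (bounce == NULL) {
        return -1;
    }

    while (len > 0) {
        size_t n = len < buf_len ? len : buf_len;

        memcpy(bounce, src, n);   /* stands in for the device read */
        memcpy(dst, bounce, n);   /* stands in for the guest DMA */
        src += n;
        dst += n;
        len -= n;
        chunks++;
    }

    free(bounce);
    return chunks;
}
```

With this shape a guest request of any size costs at most XFER_CAP_BYTES of
host memory at a time, at the price of issuing the I/O in several pieces.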

>   And with zerocopy it will be even less of a problem as we don't need host
> memory to buffer the data ...

Zero-copy isn't possible in many cases. You must handle the other cases
gracefully.
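
A rough sketch of handling both cases, under stated assumptions: try_map() is
a hypothetical stand-in for "can this guest region be accessed directly",
BOUNCE_BYTES is an arbitrary fixed buffer size, and read_into_guest() is an
illustrative name. None of these are QEMU APIs. When mapping fails, the code
falls back to a small fixed bounce buffer rather than allocating the whole
request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_BYTES 4096   /* hypothetical fixed bounce-buffer size */

enum xfer_path { XFER_ZEROCOPY, XFER_BOUNCED };

/*
 * Hypothetical mapping helper: returns a direct pointer when the guest
 * region is plain RAM, NULL when it is not (for example MMIO).
 */
static unsigned char *try_map(unsigned char *guest, int is_ram)
{
    return is_ram ? guest : NULL;
}

/* Copy len bytes of "disk" data into guest memory, reporting the path taken. */
static enum xfer_path read_into_guest(unsigned char *guest, int is_ram,
                                      const unsigned char *disk, size_t len)
{
    unsigned char bounce[BOUNCE_BYTES];
    unsigned char *direct = try_map(guest, is_ram);

    if (direct != NULL) {
        /* Zero-copy path: data lands straight in guest memory. */
        memcpy(direct, disk, len);
        return XFER_ZEROCOPY;
    }

    /* Fallback path: bounded buffer, one chunk at a time. */
    while (len > 0) {
        size_t n = len < BOUNCE_BYTES ? len : BOUNCE_BYTES;

        memcpy(bounce, disk, n);
        memcpy(guest, bounce, n);
        disk += n;
        guest += n;
        len -= n;
    }
    return XFER_BOUNCED;
}
```

Either path delivers the same bytes; only the fallback needs host buffer
space, and that space stays fixed regardless of the request size.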

> >> It doesn't.  The disconnect and thus the opportunity to submit more
> >> commands while the device is busy doing the actual I/O is there.
> >
> > Disconnecting on the first DMA request (after switching to a data phase
> > and transferring zero bytes) is bizarre behavior, but probably allowable.
> 
> The new lsi code doesn't.  The old code could do that under certain
> circumstances.  And what is bizarre about that?  A real hard drive will
> most likely do exactly that on reads (unless it has the data cached and
> can start the transfer instantly).

No. The old code goes directly from the command phase to the message 
(disconnect) phase.

> > However by my reading DMA transfers must be performed synchronously by
> > the SCRIPTS engine, so you need to do a lot of extra checking to prove
> > that you can safely continue execution without actually performing the
> > transfer.
> 
> I'll happily add a 'strict' mode which does data transfers synchronously
> in case any compatibility issues show up.
> 
> Such a mode would be slower of course.  We'll have to either do the I/O
> in lots of little chunks or lose zerocopy.  Large transfers + memcpy is
> probably the faster option.

But as you agreed above, large transfers+memcpy is not a realistic option 
because it can have excessive memory requirements.

Paul