From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1GlUC0-0006RV-3Z
	for qemu-devel@nongnu.org; Sat, 18 Nov 2006 12:41:28 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1GlUBz-0006QR-3G
	for qemu-devel@nongnu.org; Sat, 18 Nov 2006 12:41:27 -0500
Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1GlUBy-0006Pv-Je
	for qemu-devel@nongnu.org; Sat, 18 Nov 2006 12:41:26 -0500
Received: from [84.96.92.11] (helo=smtp.Neuf.fr)
	by monty-python.gnu.org with esmtp (Exim 4.52) id 1GlUBy-0000ZK-Gg
	for qemu-devel@nongnu.org; Sat, 18 Nov 2006 12:41:26 -0500
Received: from [84.102.211.76] by sp604005mt.gpm.neuf.ld
	(Sun Java System Messaging Server 6.2-5.05 (built Feb 16 2006))
	with ESMTP id <0J8X00DZ8T203N70@sp604005mt.gpm.neuf.ld> for
	qemu-devel@nongnu.org; Sat, 18 Nov 2006 18:39:36 +0100 (CET)
Date: Sat, 18 Nov 2006 18:40:51 +0100
From: Fabrice Bellard <fabrice@bellard.org>
Subject: Re: [Qemu-devel] [PATCH] Experimental initial patch providing
	accelerated OpenGL for Linux i386 (2nd attempt to post)
In-reply-to: <200611181359.25068.even.rouault@mines-paris.org>
Message-id: <455F45A3.4010806@bellard.org>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: QUOTED-PRINTABLE
References: <200611162134.48783.even.rouault@mines-paris.org>
	<455CE92F.2000100@bellard.org>
	<200611181359.25068.even.rouault@mines-paris.org>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Even Rouault <even.rouault@mines-paris.org>
Cc: qemu-devel@nongnu.org

Even Rouault wrote:
> On Thursday 16 November 2006 at 23:41, you wrote:
>
>>My main remark is that the host/guest communication system must be
>>changed and I can help you to implement it. I would prefer to use a PCI
>>device and to avoid any i386 dependent code. For the PCI device, using
>>the Bochs VGA adapter could be a possible idea. All the parameters and
>>data should be transmitted as if the PCI device was doing DMA. A single
>>I/O port could be used to start executing a list of OpenGL commands.
>
>
> Hi,
>
> I would indeed appreciate help, or at least some pointers to start in
> the direction you propose, as I know hardly anything about hardware
> programming, such as PCI, memory mapped regions, DMA, etc. So my
> questions may sound very naive.
> As you stated, the current solution is i386 dependent, but this
> dependency is very thin, so I imagined that it should be possible to
> find an equivalent of the current int 0x99 trap for other architectures.
> Apart from portability to other architectures, what would be the other
> advantages of a solution based on a PCI device? Better security? Better
> performance when KQEMU is enabled?

The PCI device is not necessarily an advantage in terms of performance,
and it will be more complicated to implement on both the host and the
guest. But it is better in terms of security, and it avoids adding
unnecessary hacks to the CPU core (for example, I consider the use of
virtual addresses a hack).

> I've looked at vga.c and I have the feeling that with
> cpu_register_io_memory/cpu_register_physical_memory you can install
> callback functions that will intercept reads/writes to a range of the
> physical memory of the target machine. Am I right?

Yes.
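
To make the idea concrete, here is a minimal, self-contained sketch of
how callbacks can intercept accesses to a physical address range. It is
only a toy model of the mechanism, not the QEMU code: every name in it
(register_region, phys_read32, phys_write32, struct toy_dev) is
hypothetical, and the real cpu_register_io_memory and
cpu_register_physical_memory functions have different signatures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types: the offset is relative to the region base. */
typedef uint32_t (*mmio_read_fn)(void *opaque, uint32_t offset);
typedef void (*mmio_write_fn)(void *opaque, uint32_t offset, uint32_t value);

struct mmio_region {
    uint32_t base, size;
    mmio_read_fn read;
    mmio_write_fn write;
    void *opaque;              /* device state passed back to the callbacks */
};

#define MAX_REGIONS 8
#define RAM_WORDS   1024       /* 4 KiB of toy guest RAM */

static struct mmio_region regions[MAX_REGIONS];
static int n_regions;
static uint32_t ram[RAM_WORDS];

/* Register callbacks for [base, base + size). Returns 0, or -1 if full. */
int register_region(uint32_t base, uint32_t size,
                    mmio_read_fn rd, mmio_write_fn wr, void *opaque)
{
    if (n_regions == MAX_REGIONS)
        return -1;
    regions[n_regions++] = (struct mmio_region){ base, size, rd, wr, opaque };
    return 0;
}

static struct mmio_region *find_region(uint32_t addr)
{
    for (int i = 0; i < n_regions; i++) {
        struct mmio_region *r = &regions[i];
        if (addr >= r->base && addr - r->base < r->size)
            return r;
    }
    return NULL;
}

/* Every guest physical access goes through these two functions. */
void phys_write32(uint32_t addr, uint32_t value)
{
    struct mmio_region *r = find_region(addr);
    if (r)
        r->write(r->opaque, addr - r->base, value);   /* intercepted */
    else if (addr / 4 < RAM_WORDS)
        ram[addr / 4] = value;                        /* plain RAM */
}

uint32_t phys_read32(uint32_t addr)
{
    struct mmio_region *r = find_region(addr);
    if (r)
        return r->read(r->opaque, addr - r->base);
    return addr / 4 < RAM_WORDS ? ram[addr / 4] : 0;
}

/* Toy device: remembers the last value written and counts the writes. */
struct toy_dev { uint32_t last; unsigned writes; };

uint32_t toy_read(void *opaque, uint32_t offset)
{
    (void)offset;
    return ((struct toy_dev *)opaque)->last;
}

void toy_write(void *opaque, uint32_t offset, uint32_t value)
{
    struct toy_dev *d = opaque;
    (void)offset;
    d->last = value;
    d->writes++;
}
```

After register_region(0x10000, 0x100, toy_read, toy_write, &dev), a
write to 0x10004 reaches toy_write, while a write to address 8 goes to
ordinary RAM.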

> But I don't see how the replacement libGL can read/write physical
> memory from a userland process. I suppose it needs some special
> privileges to use, for example, an ioctl, or maybe writing a kernel
> module. So it would become guest OS dependent. Furthermore, doesn't
> this solution imply more memcpy calls that may affect performance?
> Currently, if a memory range of a guest process (let's say a texture)
> is by chance mapped contiguously into guest physical memory, we don't
> need to do any copy before passing it to the host libGL, though I've
> not benchmarked whether it really improves performance.

You have no choice but to add a kernel module to handle the transfers to
and from the PCI device. Basically, you must write a small XFree
DRM-like kernel driver. Since the PCI device will only handle physical
memory, the kernel driver will convert the virtual addresses to physical
addresses and ensure that the corresponding pages are not swapped out by
the guest OS. The PCI device must handle lists of physical I/O regions
so that no memcpy is needed to do the transfers (scatter/gather DMA).
The performance should be the same as with your current implementation.
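
As an illustration of the scatter/gather idea, the sketch below shows
how a guest driver could turn a virtually contiguous buffer into a list
of physical pieces, and how the device side could then read those pieces
in place without an intermediate copy. It is a simplified model under
stated assumptions: the flat page_table array, the 4 KiB page size, and
the names sg_entry, sg_build and sg_sum_bytes are all hypothetical, and
a real driver would also have to pin the pages.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical descriptor: one physically contiguous piece of a buffer. */
struct sg_entry {
    uint32_t phys_addr;   /* guest physical address of the piece */
    uint32_t len;         /* length of the piece in bytes */
};

/*
 * Guest driver side: split [vaddr, vaddr + len) at page boundaries,
 * translate each page through page_table (virtual page number ->
 * physical page number), and merge pieces that happen to be physically
 * adjacent. Returns the number of entries, or -1 on error.
 */
int sg_build(const uint32_t *page_table, uint32_t n_pages,
             uint32_t vaddr, uint32_t len,
             struct sg_entry *sg, int max_entries)
{
    int n = 0;

    while (len > 0) {
        uint32_t vpn = vaddr / PAGE_SIZE;
        uint32_t off = vaddr % PAGE_SIZE;
        uint32_t chunk = PAGE_SIZE - off;
        uint32_t paddr;

        if (vpn >= n_pages)
            return -1;                    /* address not mapped */
        if (chunk > len)
            chunk = len;
        paddr = page_table[vpn] * PAGE_SIZE + off;

        if (n > 0 && sg[n - 1].phys_addr + sg[n - 1].len == paddr) {
            sg[n - 1].len += chunk;       /* extend the previous piece */
        } else {
            if (n == max_entries)
                return -1;                /* list too small */
            sg[n].phys_addr = paddr;
            sg[n].len = chunk;
            n++;
        }
        vaddr += chunk;
        len -= chunk;
    }
    return n;
}

/*
 * Device side: walk the list and read each piece directly from guest
 * physical memory. Summing the bytes stands in for real processing.
 * Returns -1 if a piece falls outside physical memory.
 */
long sg_sum_bytes(const uint8_t *phys_mem, uint32_t phys_size,
                  const struct sg_entry *sg, int n)
{
    long sum = 0;

    for (int i = 0; i < n; i++) {
        if (sg[i].phys_addr > phys_size ||
            sg[i].len > phys_size - sg[i].phys_addr)
            return -1;
        for (uint32_t j = 0; j < sg[i].len; j++)
            sum += phys_mem[sg[i].phys_addr + j];
    }
    return sum;
}
```

When the guest pages are physically contiguous, the list collapses to a
single entry, which is the same best case as in the current
implementation.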

Moreover, your protocol could queue several OpenGL commands in a FIFO,
because "int 0x99" or the equivalent PCI write command takes some time
to execute, especially when using kqemu, where an exception is raised.
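
A rough sketch of such batching, assuming a fixed-size queue on the
guest side: commands accumulate in memory and a single notification
(standing in for "int 0x99" or the I/O port write) hands the whole batch
to the host. The command layout, the queue size and all the names here
are hypothetical, and the host-side execution is reduced to a counter.

```c
#include <stdint.h>

#define FIFO_SIZE 64

/* Hypothetical encoded OpenGL command. */
struct gl_cmd {
    uint32_t opcode;
    uint32_t arg;
};

struct cmd_fifo {
    struct gl_cmd cmds[FIFO_SIZE];
    unsigned count;       /* commands currently queued */
    unsigned doorbells;   /* number of expensive guest->host transitions */
    unsigned executed;    /* commands the host has consumed so far */
};

void fifo_init(struct cmd_fifo *f)
{
    f->count = 0;
    f->doorbells = 0;
    f->executed = 0;
}

/* One guest->host transition submits everything that is queued. */
void fifo_flush(struct cmd_fifo *f)
{
    if (f->count == 0)
        return;
    f->doorbells++;              /* stands in for the trap or port write */
    f->executed += f->count;     /* the host would decode each command here */
    f->count = 0;
}

/* Queue a command; only a full queue forces a transition. */
void fifo_push(struct cmd_fifo *f, uint32_t opcode, uint32_t arg)
{
    if (f->count == FIFO_SIZE)
        fifo_flush(f);
    f->cmds[f->count].opcode = opcode;
    f->cmds[f->count].arg = arg;
    f->count++;
}
```

With this queue size, 100 queued commands followed by a final
fifo_flush cost two transitions instead of 100. A call that returns a
value to the guest would have to flush before it can complete.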

Another point is that I fear that your current use of glX is not
portable and can lead to subtle problems. You should rely on SDL/OpenGL
on the host side and leave glX on the guest OS.

All in all, what I propose gets very close to adding something like a
real 3D VGA device to QEMU and a new 3D driver to X11!

[My intent before your submission was to emulate a recent Intel 3D card,
because their protocols are mostly documented now (at least in the X11
source!) and because these recent cards support higher-level 3D
operations such as hardware 3D transformations. My guess is that
converting their DMA commands to OpenGL is easy.

As writing an Intel 3D emulation would take time and is likely to be
very complicated to tune for closed-source guest OSes, I think it is
safer to begin by improving your proposal.]

Regards,

Fabrice.