From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46202)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1ZC58K-0006U4-DX
	for qemu-devel@nongnu.org; Mon, 06 Jul 2015 08:04:28 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1ZC58E-0004zl-83
	for qemu-devel@nongnu.org; Mon, 06 Jul 2015 08:04:24 -0400
Received: from mx1.redhat.com ([209.132.183.28]:58213)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <pbonzini@redhat.com>) id 1ZC58D-0004ys-Vo
	for qemu-devel@nongnu.org; Mon, 06 Jul 2015 08:04:18 -0400
References: <20150702205556-mutt-send-email-mst@redhat.com>
	<55958AE2.1020600@redhat.com>
	<20150704230256-mutt-send-email-mst@redhat.com>
	<559A3246.7020103@redhat.com>
	<20150706105048-mutt-send-email-mst@redhat.com>
	<559A4067.3060109@redhat.com>
	<20150706120539-mutt-send-email-mst@redhat.com>
	<CAFEAcA8XOicFoWSN7_aQjx7754XODaBpXLoC1bpEL+M=1wmoSQ@mail.gmail.com>
	<20150706125811-mutt-send-email-mst@redhat.com>
	<CAFEAcA-EdJKgOHzKRr9e1XZu05FHRarZXC-gpAvUCnAL98FQXA@mail.gmail.com>
	<20150706132538-mutt-send-email-mst@redhat.com>
	<CAFEAcA-3KVV5_FTgLvijYXHTb1cZ2QR1hd8=WqLyVV_a9mEO_g@mail.gmail.com>
From: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <559A6EBC.4010004@redhat.com>
Date: Mon, 6 Jul 2015 14:04:12 +0200
MIME-Version: 1.0
In-Reply-To: <CAFEAcA-3KVV5_FTgLvijYXHTb1cZ2QR1hd8=WqLyVV_a9mEO_g@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH] virtio-pci: implement cfg capability
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Peter Maydell <peter.maydell@linaro.org>, "Michael S. Tsirkin" <mst@redhat.com>
Cc: =?UTF-8?Q?Herv=c3=a9_Poussineau?= <hpoussin@reactos.org>, QEMU Developers <qemu-devel@nongnu.org>


On 06/07/2015 13:50, Peter Maydell wrote:
> On 6 July 2015 at 11:31, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Mon, Jul 06, 2015 at 11:04:24AM +0100, Peter Maydell wrote:
>>> On 6 July 2015 at 11:03, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Mon, Jul 06, 2015 at 10:11:18AM +0100, Peter Maydell wrote:
>>>>> But address_space_rw() is just the "memcpy bytes to the
>>>>> target's memory" operation -- if you have a pile of bytes
>>>>> then there are no endianness concerns. If you don't have
>>>>> a pile of bytes then you need to know the structure of
>>>>> the data you're DMAing around, and you should probably
>>>>> have a loop doing things with the specify-the-width functions.
>>>
>>>> Absolutely. But what if DMA happens to target another device
>>>> and not memory? Device needs some endian-ness so it needs
>>>> to be converted to that.
>>>
>>> Yes, and address_space_rw() already deals with conversion to
>>> that device's specified endianness.
>
>> Yes, but incorrectly if target endian != host endian.
>> For example, LE target and LE device on BE host.
>
> Having walked through the code, got confused, talked to
> bonzini on IRC about it and got unconfused again,

Ah, *that discussion*.  So it was yet another XY question :) but it was
for the better, because it also helped me abstract Michael's question.

Peter's analysis below summarizes the implementation very well.

> I believe
> we do get this correct.
>
>  * address_space_rw() takes a pointer to a pile of bytes
>  * if the destination is RAM, we just memcpy them (because
>    guest RAM is also a pile of bytes)
>  * if the destination is a device, then we read a value
>    out of the pile of bytes at whatever width the target
>    device can handle. The functions we use for this are
>    ldq_p/ldl_p/etc, which do "load target endianness"
>    (ie "interpret this set of 4 bytes as if it were an
>    integer in the target-endianness") because the API of
>    memory_region_dispatch_write() is that it takes a uint64_t
>    data whose contents are the value to write in target
>    endianness order. (This is regrettably undocumented.)

^^ And this is the part where "the endianness of the CPU->device
bus/link" enters the picture.  But it doesn't matter if the source is
instead another device.  What matters is that address_space_rw() manages
conversion from a pile of bytes, and the device doing DMA provides
that---a pile of bytes.
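
As a rough illustration of the write path Peter describes (a simplified
model, not the actual QEMU code; sketch_load32(), sketch_bswap32() and
sketch_dispatch_write32() are hypothetical names made up for this
example), the value the device callback sees depends only on the bytes
and on the device's declared endianness, because the target-endian load
and the later adjustment cancel out:

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not QEMU's real types. */
enum sketch_endian { SKETCH_LE, SKETCH_BE };

/* Interpret 4 bytes as an integer of the given endianness.
 * Written with shifts, so the result does not depend on the host. */
static uint32_t sketch_load32(const uint8_t *p, enum sketch_endian e)
{
    if (e == SKETCH_LE) {
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    return (uint32_t)p[3] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[1] << 16) | ((uint32_t)p[0] << 24);
}

static uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of a 4-byte write to a device:
 *  1. load from the pile of bytes in target endianness (the ldl_p step)
 *  2. swap if the device asked for the other endianness
 *     (the adjust_endianness step)
 * Returns the value the device's write callback would receive. */
static uint32_t sketch_dispatch_write32(const uint8_t *buf,
                                        enum sketch_endian target,
                                        enum sketch_endian device)
{
    uint32_t val = sketch_load32(buf, target);
    if (device != target) {
        val = sketch_bswap32(val);
    }
    return val;
}
```

With the bytes 11 22 33 44, a little-endian device receives 0x44332211
and a big-endian device receives 0x11223344, whatever the target is.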

In the patch at the beginning of this thread, problems arose because
what you passed to address_space_write wasn't just a "pile of bytes"
coming from a network packet or a disk sector.  Instead, it was the
outcome of a previous conversion from "pile of bytes" to "bytes
representing an integer in little-endian format".  This conversion may
already have included a byteswap.

Once you have established that the bytes represent an integer, the right
way to access them is to use ld*_p/st*_p and
address_space_ld*/address_space_st*.  This ensures that you do an even
number of further byteswaps; for *_le_p and address_space_*_le, there
will be 0 further byteswaps on little-endian hosts and 2 on big-endian
hosts.
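
To make the byteswap counting concrete, here is a minimal sketch of what
a little-endian store/load pair does.  The sketch_* helpers below are
hypothetical models written for this mail, not the real stl_le_p() and
ldl_le_p() implementations; they only assume that the host is either
little- or big-endian:

```c
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time (illustration only). */
static int sketch_host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

static uint32_t sketch_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Model of a "store 32-bit little-endian" helper:
 * no swap on a little-endian host, one swap on a big-endian host. */
static void sketch_stl_le_p(uint8_t *p, uint32_t v)
{
    if (sketch_host_is_big_endian()) {
        v = sketch_swap32(v);
    }
    memcpy(p, &v, sizeof(v));
}

/* Model of the matching load: again zero or one swap, so a
 * store followed by a load does 0 or 2 swaps in total. */
static uint32_t sketch_ldl_le_p(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return sketch_host_is_big_endian() ? sketch_swap32(v) : v;
}
```

Storing 0x11223344 yields the bytes 44 33 22 11 on either kind of host,
and loading them back returns 0x11223344; the swaps always come in pairs.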

Paolo

>  * memory_region_dispatch_write() then calls adjust_endianness(),
>    converting a target-endian value to the endianness the
>    device says it requires
>  * we then call the device's read/write functions, whose API
>    is that they get a value in the endianness they asked for.
>
>> IO callbacks always get a native endian format so they expect to get
>> byte 0 of the buffer in MSB on this host.
>
> IO callbacks get the format they asked for (which might
> be BE, LE or target endianness). They will get byte 0 of
> the buffer in the MSB if they said they were BE devices
> (or if they said they were target-endian on a BE target).