Date: Mon, 6 Jul 2015 15:15:16 +0300
From: "Michael S. Tsirkin"
To: Paolo Bonzini
Cc: Peter Maydell, Hervé Poussineau, QEMU Developers
Subject: Re: [Qemu-devel] [PATCH] virtio-pci: implement cfg capability
Message-ID: <20150706151227-mutt-send-email-mst@redhat.com>
In-Reply-To: <559A6EBC.4010004@redhat.com>
References: <559A3246.7020103@redhat.com> <20150706105048-mutt-send-email-mst@redhat.com> <559A4067.3060109@redhat.com> <20150706120539-mutt-send-email-mst@redhat.com> <20150706125811-mutt-send-email-mst@redhat.com> <20150706132538-mutt-send-email-mst@redhat.com> <559A6EBC.4010004@redhat.com>

On Mon, Jul 06, 2015 at 02:04:12PM +0200, Paolo Bonzini wrote:
>
>
> On 06/07/2015 13:50, Peter Maydell wrote:
> > On 6 July 2015 at 11:31, Michael S. Tsirkin wrote:
> >> On Mon, Jul 06, 2015 at 11:04:24AM +0100, Peter Maydell wrote:
> >>> On 6 July 2015 at 11:03, Michael S. Tsirkin wrote:
> >>>> On Mon, Jul 06, 2015 at 10:11:18AM +0100, Peter Maydell wrote:
> >>>>> But address_space_rw() is just the "memcpy bytes to the
> >>>>> target's memory" operation -- if you have a pile of bytes
> >>>>> then there are no endianness concerns.
> >>>>> If you don't have
> >>>>> a pile of bytes then you need to know the structure of
> >>>>> the data you're DMAing around, and you should probably
> >>>>> have a loop doing things with the specify-the-width functions.
> >>>
> >>>> Absolutely. But what if DMA happens to target another device
> >>>> and not memory? Device needs some endian-ness so it needs
> >>>> to be converted to that.
> >>>
> >>> Yes, and address_space_rw() already deals with conversion to
> >>> that device's specified endianness.
> >
> >> Yes, but incorrectly if target endian != host endian.
> >> For example, LE target and LE device on BE host.
> >
> > Having walked through the code, got confused, talked to
> > bonzini on IRC about it and got unconfused again,
>
> Ah, *that discussion*. So it was yet another XY question, :) but for
> the better because it also helped me abstract Michael's question.
>
> Peter's analysis below summarizes the implementation very well.
>
> > I believe
> > we do get this correct.
> >
> > * address_space_rw() takes a pointer to a pile of bytes
> > * if the destination is RAM, we just memcpy them (because
> >   guest RAM is also a pile of bytes)
> > * if the destination is a device, then we read a value
> >   out of the pile of bytes at whatever width the target
> >   device can handle. The functions we use for this are
> >   ldl_q/ldl_p/etc, which do "load target endianness"
> >   (ie "interpret this set of 4 bytes as if it were an
> >   integer in the target-endianness") because the API of
> >   memory_region_dispatch_write() is that it takes a uint64_t
> >   data whose contents are the value to write in target
> >   endianness order. (This is regrettably undocumented.)
>
> ^^ And this is the part where "the endianness of the CPU->device
> bus/link" enters the picture. But it doesn't matter if the source is
> instead another device. What matters is that address_space_rw() manages
> conversion from a pile of bytes, and the device doing DMA provides
> that---a pile of bytes.
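
A minimal sketch of the two destination cases quoted above (RAM versus
device), assuming a 4-byte access. Every name here (sketch_*,
SketchDevice) is hypothetical and is not a QEMU function or type, and the
target byte order is passed as a flag purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device; not a QEMU type. */
typedef struct {
    uint32_t last_value;   /* integer handed to the device write path */
} SketchDevice;

/* Interpret 4 bytes as an integer in the target's byte order.
 * Shifts are used so the result does not depend on the host. */
uint32_t sketch_ld32_target(const uint8_t *p, bool target_big_endian)
{
    if (target_big_endian) {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* RAM destination: the pile of bytes is copied unchanged. */
void sketch_write_ram(uint8_t *ram, const uint8_t *buf, size_t len)
{
    assert(ram != NULL && buf != NULL);
    memcpy(ram, buf, len);
}

/* Device destination: a value is read out of the pile of bytes
 * in target byte order before it is passed on as an integer. */
void sketch_write_device(SketchDevice *dev, const uint8_t *buf,
                         bool target_big_endian)
{
    assert(dev != NULL && buf != NULL);
    dev->last_value = sketch_ld32_target(buf, target_big_endian);
}
```

In this model the RAM path never looks at byte order at all; only the
device path has to pick one, which is where the disagreement in this
thread sits.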
>
> In the patch at the beginning of this thread, problems arose because
> what you passed to address_space_write wasn't just a "pile of bytes"
> coming from a network packet or a disk sector. Instead, it was the
> outcome of a previous conversion from "pile of bytes" to "bytes
> representing an integer in little-endian format". This conversion could
> have possibly included a byteswap.

Well, not really. It's a pile of bytes from the guest's POV.
And the same thing happens if you read a pile of bytes from RAM
using address_space_read.

> Once you have established that the bytes represent an integer the right
> way to access them is to use ld*_p/st*_p and
> address_space_ld*/address_space_st*. This ensures that you do an even
> number of further byteswaps; for *_le_p and address_space_*_le, there
> will be 0 further byteswaps on little-endian hosts and 2 on big-endian
> hosts.
>
> Paolo

I believe this summarizes the implementation correctly.

I also argue that many devices use address_space_rw incorrectly,
assuming it converts from host endian, simply because
most devices are written in host endian.

> > * memory_region_dispatch_write() then calls adjust_endianness(),
> >   converting a target-endian value to the endianness the
> >   device says it requires
> > * we then call the device's read/write functions, whose API
> >   is that they get a value in the endianness they asked for.
> >
> >> IO callbacks always get a native endian format so they expect to get
> >> byte 0 of the buffer in MSB on this host.
> >
> > IO callbacks get the format they asked for (which might
> > be BE, LE or target endianness). They will get byte 0 of
> > the buffer in the MSB if they said they were BE devices
> > (or if they said they were target-endian on a BE target).
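
A rough model of the dispatch step quoted above, assuming 32-bit accesses:
load the value in target byte order, then swap it only when the device
declared a byte order that differs from the target's. All names
(sketch_*, SketchDevEndian) are hypothetical and only mirror the steps as
described in this thread; they are not the QEMU implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte order a (hypothetical) device asks for. */
typedef enum {
    SKETCH_DEV_TARGET_ENDIAN,
    SKETCH_DEV_BIG_ENDIAN,
    SKETCH_DEV_LITTLE_ENDIAN
} SketchDevEndian;

/* Reverse the four bytes of a 32-bit value. */
uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Read 4 bytes as an integer in target byte order, host-independent. */
uint32_t sketch_load_target32(const uint8_t *p, bool target_big_endian)
{
    uint32_t le = ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
                  ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
    return target_big_endian ? sketch_bswap32(le) : le;
}

/* Convert a target-endian value to what the device asked for:
 * swap only when the device's byte order differs from the target's. */
uint32_t sketch_adjust_endianness(uint32_t v, SketchDevEndian dev,
                                  bool target_big_endian)
{
    bool dev_big;

    if (dev == SKETCH_DEV_TARGET_ENDIAN) {
        return v;
    }
    dev_big = (dev == SKETCH_DEV_BIG_ENDIAN);
    return dev_big != target_big_endian ? sketch_bswap32(v) : v;
}

/* Value the device callback would see for a 4-byte buffer. */
uint32_t sketch_dispatch_write32(const uint8_t *buf, SketchDevEndian dev,
                                 bool target_big_endian)
{
    assert(buf != NULL);
    return sketch_adjust_endianness(
        sketch_load_target32(buf, target_big_endian),
        dev, target_big_endian);
}
```

With the buffer {0x11, 0x22, 0x33, 0x44}, a BE device in this model sees
0x11223344 (byte 0 in the MSB) on either target, and a target-endian
device sees the same only on a BE target. The host byte order never
appears in the model, which matches the claim that the result depends on
target and device endianness only.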