From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.33) id 1Cm4YZ-0005zw-SL for qemu-devel@nongnu.org; Wed, 05 Jan 2005 01:22:07 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.33) id 1Cm4YY-0005zO-Rv for qemu-devel@nongnu.org; Wed, 05 Jan 2005 01:22:07 -0500
Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.33) id 1Cm4YY-0005zE-DY for qemu-devel@nongnu.org; Wed, 05 Jan 2005 01:22:06 -0500
Received: from [64.233.184.205] (helo=wproxy.gmail.com) by monty-python.gnu.org with esmtp (Exim 4.34) id 1Cm4DS-0007yb-E0 for qemu-devel@nongnu.org; Wed, 05 Jan 2005 01:00:18 -0500
Received: by wproxy.gmail.com with SMTP id 68so108691wri for ; Tue, 04 Jan 2005 22:00:16 -0800 (PST)
Message-ID:
Date: Wed, 5 Jan 2005 01:00:15 -0500
From: Karl Magdsick
Subject: Re: [Qemu-devel] Endian and userspace issues
In-Reply-To:
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
References: <200501042016.03910.paul@codesourcery.com>
Reply-To: Karl Magdsick , qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: qemu-devel@nongnu.org

The difficulty comes when, for instance, you have a struct containing a u_int16_t, followed by an int32_t[2], followed by a u_int8_t[2]. You pass a pointer to this struct to an AES or Twofish encryption implementation that takes a void*. Internally, this void* is treated as a u_int32_t*. If the struct were little endian, the most significant byte of the u_int16_t would end up in the middle of the first 32-bit word as viewed by the encryption function. The trouble is that the necessary type casts in C/C++ don't follow through to the machine code. You can guess where the casts occur, but that quickly becomes difficult.
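To make the struct example concrete, here is a minimal sketch in C of why the two views disagree. It assumes natural alignment (the u_int16_t at offset 0, the int32_t[2] at offset 4, the u_int8_t[2] at offset 12, 16 bytes in total); that layout and the names REC_SIZE, swap_by_field and swap_by_word are made up for illustration and are not from any real emulator. One function converts a little-endian image field by field, as the declared types imply. The other converts it as four anonymous 32-bit words, which is all the encryption routine's machine code reveals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed guest layout (little-endian, natural alignment):
 *   offset  0: u_int16_t tag       (2 bytes, then 2 bytes of padding)
 *   offset  4: int32_t   words[2]
 *   offset 12: u_int8_t  tail[2]   (2 bytes, then 2 bytes of padding)
 * The layout and all names here are hypothetical. */
enum { REC_SIZE = 16 };

/* Convert to big-endian field by field, following the declared types. */
void swap_by_field(const uint8_t le[REC_SIZE], uint8_t out[REC_SIZE])
{
    assert(le != out);                    /* in-place conversion not supported */
    memcpy(out, le, REC_SIZE);
    out[0] = le[1];                       /* the 16-bit tag */
    out[1] = le[0];
    for (int w = 4; w < 12; w += 4)       /* the two 32-bit words */
        for (int i = 0; i < 4; i++)
            out[w + i] = le[w + 3 - i];
    /* tail[] is a byte array, so it is left where it is */
}

/* Convert the same bytes as the cipher sees them: four 32-bit words. */
void swap_by_word(const uint8_t le[REC_SIZE], uint8_t out[REC_SIZE])
{
    assert(le != out);                    /* in-place conversion not supported */
    for (int w = 0; w < REC_SIZE; w += 4)
        for (int i = 0; i < 4; i++)
            out[w + i] = le[w + 3 - i];
}
```

The two conversions agree on the int32_t[2] in the middle but disagree on everything else: swap_by_word moves the 16-bit field to offsets 2 and 3 and both reverses and relocates the two trailing bytes. An emulator that only sees 32-bit loads cannot tell which of the two conversions the data needs.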

Now, the way one would first think to implement x86 emulation on a big-endian architecture would be to implement straightforward access for int8_ts and insert thunks for int16_ts, int32_ts, floats, doubles, and long doubles. In other words, the "normal" way to emulate a CPU on top of a CPU of the opposite endianness will result in code optimized for 8-bit memory accesses.

I imagine that the VirtualPC "big-endian x86" implementation really isn't big-endian, but instead optimizes the memory layout for aligned 32-bit access. I imagine the memory is viewed as an array of aligned 32-bit big-endian words instead of an array of bytes. Loading (and storing) int32_ts and floats requires no endian changes as long as the memory accesses are 32-bit aligned. Loading (or storing) an int8_t would require a simple thunk that XORs its address with 0x3, and loading (or storing) an aligned int16_t would require a simple thunk that XORs its address with 0x2. Doubles, long doubles, and unaligned accesses require more complicated thunks.

However, aligned int32_ts are the fastest accesses on 32-bit x86 CPUs, so performance-tuned code will use aligned int32_ts wherever possible. Optimizing the memory layout for aligned 32-bit access is therefore likely a net gain. If you're emulating an architecture that doesn't allow unaligned memory accesses, the gains are even greater. Called native libraries would still need to be aware of the non-standard memory layout, or else the emulator will need to have knowledge of the APIs and insert extra thunks in the function calls between emulated and native code.

If you were emulating only user-space, I imagine you could insert alternative implementations of calloc() and malloc() that set aside some space for access accounting. You could then check whether the first access to the allocated memory was an aligned 32-bit access and, if so, mark the allocated buffer as optimized for 32-bit aligned access; otherwise you would use "normal" emulation.
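
The XOR thunks described above for the word-oriented layout can be sketched roughly as follows. This is only an illustration of the address arithmetic, not VirtualPC's or QEMU's actual code: guest_ram, GUEST_RAM_SIZE and the guest_load/guest_store helpers are hypothetical names. Guest memory is modelled as a byte array in which every aligned 32-bit word is kept in big-endian order. The 32-bit accessors spell the bytes out so the sketch behaves the same on any host; on a real big-endian host they would be plain native loads and stores.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guest RAM for a little-endian guest. Each aligned 32-bit
 * word is stored in big-endian byte order. Tiny on purpose. */
enum { GUEST_RAM_SIZE = 64 };
uint8_t guest_ram[GUEST_RAM_SIZE];

/* Aligned 32-bit access: no address fix-up is needed. */
uint32_t guest_load32(uint32_t addr)
{
    assert((addr & 3u) == 0 && addr <= GUEST_RAM_SIZE - 4);
    return ((uint32_t)guest_ram[addr] << 24) |
           ((uint32_t)guest_ram[addr + 1] << 16) |
           ((uint32_t)guest_ram[addr + 2] << 8) |
            (uint32_t)guest_ram[addr + 3];
}

void guest_store32(uint32_t addr, uint32_t v)
{
    assert((addr & 3u) == 0 && addr <= GUEST_RAM_SIZE - 4);
    guest_ram[addr]     = (uint8_t)(v >> 24);
    guest_ram[addr + 1] = (uint8_t)(v >> 16);
    guest_ram[addr + 2] = (uint8_t)(v >> 8);
    guest_ram[addr + 3] = (uint8_t)v;
}

/* 8-bit access: XOR the guest address with 0x3. */
uint8_t guest_load8(uint32_t addr)
{
    assert(addr < GUEST_RAM_SIZE);
    return guest_ram[addr ^ 3u];
}

void guest_store8(uint32_t addr, uint8_t v)
{
    assert(addr < GUEST_RAM_SIZE);
    guest_ram[addr ^ 3u] = v;
}

/* Aligned 16-bit access: XOR the guest address with 0x2, then read or
 * write the halfword in big-endian order. */
uint16_t guest_load16(uint32_t addr)
{
    assert((addr & 1u) == 0 && addr <= GUEST_RAM_SIZE - 2);
    uint32_t a = addr ^ 2u;
    return (uint16_t)((guest_ram[a] << 8) | guest_ram[a + 1]);
}

void guest_store16(uint32_t addr, uint16_t v)
{
    assert((addr & 1u) == 0 && addr <= GUEST_RAM_SIZE - 2);
    uint32_t a = addr ^ 2u;
    guest_ram[a]     = (uint8_t)(v >> 8);
    guest_ram[a + 1] = (uint8_t)v;
}
```

For example, after guest_store32(0, 0x11223344) the guest sees the little-endian bytes it expects: guest_load8(0) gives 0x44, guest_load8(3) gives 0x11, and guest_load16(0) gives 0x3344. Unaligned and 64-bit accesses are left out because they need the more complicated thunks mentioned above.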

The accounting overhead and complexity would likely make this "mixed endian memory layout" emulation more trouble than it's worth. With system emulation, you could do something similar with accounting on a per-page or per-kilobyte basis.

-Karl

On Tue, 4 Jan 2005 20:17:13 -0800, anarkhos@vfemail.net wrote:
> At 8:11 PM -0800 1/4/05, John Davidorff Pell wrote:
> >I think that part of what he is suggesting is that the code that is little endian be translated to Big endian before execution. This would make the running binary "native" in memory, and so could continue to be closely integrated with its linked libraries.
>
> Yes, that was what I was suggesting. If this were done I don't even see the need for thunks at all.