From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4DED3109.1070902@twiddle.net>
Date: Mon, 06 Jun 2011 14:56:57 -0500
From: Richard Henderson
References: <1307370348-28400-1-git-send-email-pbonzini@redhat.com> <1307370348-28400-4-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1307370348-28400-4-git-send-email-pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org

Patches 1-3:

Reviewed-by: Richard Henderson

That said,

On 06/06/2011 09:25 AM, Paolo Bonzini wrote:
> +/* conservative code for little endian unaligned accesses */
> +static inline int lduw_le_p(const void *ptr)
> +{
> +#ifdef _ARCH_PPC
> +    int val;
> +    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
> +    return val;
> +#else
> +    const uint8_t *p = ptr;
> +    return p[0] | (p[1] << 8);
> +#endif
> +}

Can we add a patch 4/3 that removes this sort of hard-coded assembly stuff in favour of generic gcc code?  E.g.
static inline int lduw_le_p(const void *ptr)
{
    struct pack { uint16_t x __attribute__((packed)); };
    const struct pack *p = ptr;
    uint16_t ret;

    ret = p->x;
    ret = le_bswap(ret, 16);
    return ret;
}

One could fairly well macroize all 6 instances.

The compiler knows that ppc and i386 can do the unaligned loads.  The
compiler *should* be able to match the bswap_N patterns, and thus also
fold the unaligned load with the bswap for ppc.

I can confirm that gcc head does in fact do the right thing for
ppc32/64 here, as well as for i386/x86_64.  I don't have previous
versions of gcc checked out atm...

I haven't checked, but this *ought* to enable the load-asi bswap
instruction for sparcv9.  I also haven't checked what happens for a
target like sparcv8 that lacks both unaligned load and bswap, to see
that we don't simply double the number of shifts.  However, I'd be
tempted to file that as a gcc missed-optimization bug.

r~