Re: [RFC PATCH 1/2] arch/m68k/lib/mulsi3.S: Optimize]

public inbox for linux-m68k@lists.linux-m68k.org
 help / color / mirror / Atom feed

From: Greg Ungerer <gregungerer@westnet.com.au>
To: George Spelvin <linux@horizon.com>,
	geert@linux-m68k.org, linux-m68k@lists.linux-m68k.org
Subject: Re: [RFC PATCH 1/2] arch/m68k/lib/mulsi3.S: Optimize]
Date: Thu, 12 May 2016 22:46:22 +1000	[thread overview]
Message-ID: <57347B1E.9030108@westnet.com.au> (raw)
In-Reply-To: <20160511102414.3631.qmail@ns.horizon.com>

Hi George,

Comments inline.


On 11/05/16 20:24, George Spelvin wrote:
> The __mulsi3 function implements 32x32-bit multiply when the processor
> (mc68000 or mc68010) does not have native support.
>
> This rewrite shrinks the code from 16 words to 13 (32 bytes to 26), and
> eliminates one stack frame access, saving in total 4 words and 16 cycles.
>
> The improvement is small (about 1/16 of the cost of the call overall),
> but multiply is a popular enough operation that it's nice to have.
>
> One open question is why the code includes a ColdFire option.
> Every ColdFire processor back to the original ISA A includes a 32x32-bit
> multiply, and so does not need this routine at all.
>
> Signed-off-by: George Spelvin <linux@horizon.com>
> ---
>   arch/m68k/lib/mulsi3.S | 132 ++++++++++++-------------------------------------
>   1 file changed, 31 insertions(+), 101 deletions(-)
>
> diff --git a/arch/m68k/lib/mulsi3.S b/arch/m68k/lib/mulsi3.S
> index c39ad4e..b4204fc 100644
> --- a/arch/m68k/lib/mulsi3.S
> +++ b/arch/m68k/lib/mulsi3.S
> @@ -1,105 +1,35 @@
> -/* libgcc1 routines for 68000 w/o floating-point hardware.
> -   Copyright (C) 1994, 1996, 1997, 1998 Free Software Foundation, Inc.
> -
> -This file is part of GNU CC.
> -
> -GNU CC is free software; you can redistribute it and/or modify it
> -under the terms of the GNU General Public License as published by the
> -Free Software Foundation; either version 2, or (at your option) any
> -later version.
> -
> -In addition to the permissions in the GNU General Public License, the
> -Free Software Foundation gives you unlimited permission to link the
> -compiled version of this file with other programs, and to distribute
> -those programs without any restriction coming from the use of this
> -file.  (The General Public License restrictions do apply in other
> -respects; for example, they cover modification of the file, and
> -distribution when not linked into another program.)
> -
> -This file is distributed in the hope that it will be useful, but
> -WITHOUT ANY WARRANTY; without even the implied warranty of
> -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -General Public License for more details. */
> -
> -/* As a special exception, if you link this library with files
> -   compiled with GCC to produce an executable, this does not cause
> -   the resulting executable to be covered by the GNU General Public License.
> -   This exception does not however invalidate any other reasons why
> -   the executable file might be covered by the GNU General Public License.  */
> -
> -/* Use this one for any 680x0; assumes no floating point hardware.
> -   The trailing " '" appearing on some lines is for ANSI preprocessors.  Yuk.
> -   Some of this code comes from MINIX, via the folks at ericsson.
> -   D. V. Henkel-Wallace (gumby@cygnus.com) Fete Bastille, 1992
> -*/
> -
> -/* These are predefined by new versions of GNU cpp.  */
> -
> -#ifndef __USER_LABEL_PREFIX__
> -#define __USER_LABEL_PREFIX__ _
> -#endif
> -
> -#ifndef __REGISTER_PREFIX__
> -#define __REGISTER_PREFIX__
> -#endif
> -
> -#ifndef __IMMEDIATE_PREFIX__
> -#define __IMMEDIATE_PREFIX__ #
> -#endif
> -
> -/* ANSI concatenation macros.  */
> -
> -#define CONCAT1(a, b) CONCAT2(a, b)
> -#define CONCAT2(a, b) a ## b
> -
> -/* Use the right prefix for global labels.  */
> -
> -#define SYM(x) CONCAT1 (__USER_LABEL_PREFIX__, x)
> -
> -/* Use the right prefix for registers.  */
> -
> -#define REG(x) CONCAT1 (__REGISTER_PREFIX__, x)
> -
> -/* Use the right prefix for immediate values.  */
> -
> -#define IMM(x) CONCAT1 (__IMMEDIATE_PREFIX__, x)
> -
> -#define d0 REG (d0)
> -#define d1 REG (d1)
> -#define d2 REG (d2)
> -#define d3 REG (d3)
> -#define d4 REG (d4)
> -#define d5 REG (d5)
> -#define d6 REG (d6)
> -#define d7 REG (d7)
> -#define a0 REG (a0)
> -#define a1 REG (a1)
> -#define a2 REG (a2)
> -#define a3 REG (a3)
> -#define a4 REG (a4)
> -#define a5 REG (a5)
> -#define a6 REG (a6)
> -#define fp REG (fp)
> -#define sp REG (sp)
> -
> +/*
> + * 32x32->32-bit multiply.  This is made up of three 16x16->32-bit
> + * partial products.
> + *
> + * Assuming the average argument has half its bits set, this takes
> + * 238 68000 cycles.
> + */
>   	.text
>   	.proc
> -	.globl	SYM (__mulsi3)
> -SYM (__mulsi3):
> -	movew	sp@(4), d0	/* x0 -> d0 */
> -	muluw	sp@(10), d0	/* x0*y1 */
> -	movew	sp@(6), d1	/* x1 -> d1 */
> -	muluw	sp@(8), d1	/* x1*y0 */
> -#if !(defined(__mcf5200__) || defined(__mcoldfire__))
> -	addw	d1, d0
> -#else
> -	addl	d1, d0
> -#endif
> +	.globl	___mulsi3
> +___mulsi3:
> +	lea	4(sp), a0

This syntax fails for me (using a binutils-2.25.1 based toolchain).
Registers must be prefixed with a "%", so here %sp and %a0.

   arch/m68k/lib/mulsi3.S: Assembler messages:
   arch/m68k/lib/mulsi3.S:12: Error: syntax error -- statement `lea 
4(sp),a0' ignored

That was compiling with just this one patch for a ColdFire target.

Regards
Greg


> +	move.w	(a0)+, d0	/* xhigh */
> +	move.w	(a0)+, d1	/* xlow */
> +	move.w	d1, a1
> +	mulu.w	(a0)+, d1	/* yhigh */
> +	mulu.w	(a0),  d0	/* ylow */
> +
> +#if defined(__mcf5200__) || defined(__mcoldfire__)
> +	/* Why do we need this?  All ColdFire have MULU.L */
> +	add.l	d0, d1
> +	move.w	a1, d0		/* xlow */
> +	mulu.w	(a0),d0		/* ylow */
> +	swap	d1
> +	clr.w	d1
> +	add.l	d1, d0
> +#else /* 68000 or 68010 */
> +	add.w	d0, d1
> +	move.w	a1, d0		/* xlow */
> +	mulu.w	(a0),d0		/* ylow */
>   	swap	d0
> -	clrw	d0
> -	movew	sp@(6), d1	/* x1 -> d1 */
> -	muluw	sp@(10), d1	/* x1*y1 */
> -	addl	d1, d0
> -
> +	add.w	d1, d0
> +	swap	d0
> +#endif
>   	rts
> -
>

next prev parent reply	other threads:[~2016-05-12 12:46 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-11 10:24 [RFC PATCH 1/2] arch/m68k/lib/mulsi3.S: Optimize] George Spelvin
2016-05-11 12:38 ` Greg Ungerer
2016-05-12  8:04   ` George Spelvin
2016-05-12  8:35     ` Andreas Schwab
2016-05-12 13:14     ` Greg Ungerer
2016-05-12 12:46 ` Greg Ungerer [this message]
2016-05-12 20:52   ` George Spelvin
2016-05-13  1:07     ` Greg Ungerer
2016-05-13  2:36       ` George Spelvin
2016-05-13  6:45         ` Greg Ungerer
2016-05-13  9:02           ` George Spelvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57347B1E.9030108@westnet.com.au \
    --to=gregungerer@westnet.com.au \
    --cc=geert@linux-m68k.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux@horizon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox