Re: [Xenomai-core] llimd. - Gilles Chanteperdrix

From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: Xenomai core <Xenomai-core@domain.hid>
Subject: Re: [Xenomai-core] llimd.
Date: Fri, 31 Oct 2008 11:29:55 +0100
Message-ID: <490ADE23.9000609@domain.hid>
In-Reply-To: <490AD875.2040101@domain.hid>

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Hi Jan,
>>>>>
>>>>> I see that the implementation of rthal_llmulshft seems to account for
>>>>> the sign of the first argument. Does it work? Namely, in the generic
>>>>> implementation, will __rthal_u96shift propagate the sign bit?
>>>> Yes, this works (given there is no overflow, of course). If you consider
>>>> a high word of 0xfffffff0 and a (right) shift of 8, we effectively cut
>>>> off all the leading 1s: high << (32-8) = 0xf0000000. But this only works
>>>> because we replace a right shift with a left shift (plus some OR'ing
>>>> later on). If we had to do a real right shift, we would also have to
>>>> take signed vs. unsigned into account (ie. shift in zeros or the sign
>>>> bit from the left?).
>>>>
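A minimal sketch of the arithmetic described above, for illustration only. The helper name u96shift_sketch is hypothetical and is not the actual __rthal_u96shift macro; it assumes a 96-bit value held in three 32-bit words, a shift count strictly between 0 and 32, and no overflow of the 64-bit result. The high word only enters the result through a left shift, so the sign-extension bits above the kept range simply fall off:

```c
#include <stdint.h>

/*
 * Hypothetical helper (an assumption, not the Xenomai macro):
 * shift the 96-bit value h:m:l right by s bits (0 < s < 32) and
 * return the low 64 bits of the result.
 */
static inline uint64_t u96shift_sketch(uint32_t h, uint32_t m, uint32_t l,
				       unsigned s)
{
	/* Low result word: bits of l, topped up with the low bits of m. */
	uint32_t lo = (l >> s) | (m << (32 - s));
	/* High result word: bits of m, topped up with the low bits of h.
	 * h is only ever shifted left, so its leading sign bits are cut off
	 * instead of being shifted in from the left. */
	uint32_t hi = (m >> s) | (h << (32 - s));

	return ((uint64_t)hi << 32) | lo;
}
```

With h = 0xfffffff0, m = l = 0 and s = 8, the high result word is 0xf0000000, as in the example above. Feeding the 96-bit two's complement encoding of -256 (0xffffffff, 0xffffffff, 0xffffff00) with s = 8 gives all ones, that is -1 when read as a signed 64-bit value.
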
>>>>> If yes, do you see a way llimd could be made to work the same way? This
>>>>> way we would avoid inlining ullimd twice in the llimd code.
>>>> As the basic building block here is a multiplication, we cannot get
>>>> around telling apart signed from unsigned (or converting signed into
>>>> unsigned): the underlying multiplication logic is different.
>>>>
>>>> But what about this approach:
>>>>
>>>> static inline __attribute__((__const__)) long long
>>>> __rthal_generic_llimd (long long op, unsigned m, unsigned d)
>>>> {
>>>> 	int sign = 0;
>>>> 	long long ret;
>>>>
>>>> 	if (op < 0LL) {
>>>> 		op = -op;
>>>> 		sign = 1;
>>>> 	}
>>>> 	ret = __rthal_generic_ullimd(op, m, d);
>>>> 	return sign ? -ret : ret;
>>>> }
>>>>
>>>> However, I guess writing this in assembly for archs that suffer should
>>>> be more efficient.
>>> Hi Jan,
>>>
>>> You may have noticed that we played a bit with arithmetic operations
>>> (namely, we use an llimd without division to make the reverse of
>>> llmulshft), and it pays off on slow machines, such as ARM, where the
>>> division is done in software.
>>>
>>> On this occasion, I looked at the code generated by this solution, and I am
>>> not sure that it is better: on ARM, and I suspect this is true on other
>>> architectures, the operations needed to negate a long long clobber the
>>> condition codes, which means we cannot make these operations
>>> conditional without a conditional jump, so the hand-coded assembler is
>>> not better than what the compiler does: it uses two conditional jumps
>>> whereas the original solution uses only one. Of course we could set sign
>>> to -1 or 1, and multiply by sign at the end, but the multiplication is
>>> probably even heavier than a conditional jump.
>> Yes, on the archs that matter here (32-bit).
>>
>>> So, would you have any idea of a better solution ?
>> In an assembly version, one could save 'sign' in form of a jump target
>> that should be taken after __rthal_generic_ullimd (ie. jump to the
>> negation, or jump over it). Specifically when that address is kept in a
>> register, I think smart branch prediction units will be able to do the
>> right forecast.
> 
> Good idea, there is even a gcc extension (labels as values) which allows
> doing this in the generic section:
> 
> static inline __attribute__((__const__)) long long
> __rthal_generic_llimd (long long op, unsigned m, unsigned d)
> {
> 	void *epilogue;
> 	long long ret;
>
> 	if (op < 0LL) {
> 		op = -op;
> 		epilogue = &&ret_neg;
> 	} else
> 		epilogue = &&ret_unchanged;
> 	ret = __rthal_generic_ullimd(op, m, d);
> 	goto *epilogue;
> ret_unchanged:
> 	return ret;
> ret_neg:
> 	return -ret;
> }

This works as expected on ARM; however, gcc 4.0 on x86 generates two
calls to __rthal_generic_ullimd, with the indirect jump after each one.
It seems it stopped half-way when "optimizing"...
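
For anyone who wants to compare the generated code on their own compiler, here is a minimal standalone sketch of the two variants discussed in this thread. The names ullimd_sketch, llimd_flag and llimd_goto are hypothetical stand-ins, not the Xenomai functions; ullimd_sketch is a simplified op * m / d that assumes d != 0 and a result that fits in 64 bits, and the label-address variant needs gcc or a compatible compiler. Both signed variants assume op is not the most negative 64-bit value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the unsigned primitive: op * m / d,
 * assuming d != 0 and that the quotient fits in 64 bits. */
static uint64_t ullimd_sketch(uint64_t op, uint32_t m, uint32_t d)
{
	uint64_t tl = (uint64_t)(uint32_t)op * m;   /* low word times m */
	uint64_t th = (op >> 32) * m + (tl >> 32);  /* high word times m, plus carry */
	uint64_t qh = th / d;                       /* high part of the quotient */
	uint64_t rem = th % d;                      /* remainder carried into the low part */
	uint64_t ql = ((rem << 32) | (uint32_t)tl) / d;

	return (qh << 32) + ql;
}

/* Variant 1: remember the sign in a flag and test it again at the end. */
static int64_t llimd_flag(int64_t op, uint32_t m, uint32_t d)
{
	int sign = op < 0;
	uint64_t mag = sign ? -(uint64_t)op : (uint64_t)op;
	uint64_t ret = ullimd_sketch(mag, m, d);

	return sign ? -(int64_t)ret : (int64_t)ret;
}

/* Variant 2: remember the sign as a jump target (gcc labels as values). */
static int64_t llimd_goto(int64_t op, uint32_t m, uint32_t d)
{
	void *epilogue;
	uint64_t ret;

	if (op < 0) {
		op = -op;               /* assumes op != INT64_MIN */
		epilogue = &&ret_neg;
	} else
		epilogue = &&ret_unchanged;
	ret = ullimd_sketch((uint64_t)op, m, d);
	goto *epilogue;
ret_unchanged:
	return (int64_t)ret;
ret_neg:
	return -(int64_t)ret;
}
```

Both variants truncate toward zero, since the division is done on the magnitude: llimd_flag(-7, 1, 2) and llimd_goto(-7, 1, 2) both give -3. Compiling this with -O2 -S and counting the calls to the unsigned helper is one way to check whether a given compiler duplicates the call as described above.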

-- 
                                                 Gilles.

Thread overview: 8+ messages
2008-10-28 19:54 [Xenomai-core] llimd Gilles Chanteperdrix
2008-10-28 21:00 ` Jan Kiszka
2008-10-30 10:02   ` Gilles Chanteperdrix
2008-10-31  8:18     ` Jan Kiszka
2008-10-31 10:05       ` Gilles Chanteperdrix
2008-10-31 10:29         ` Gilles Chanteperdrix [this message]
2008-10-31 10:45           ` Gilles Chanteperdrix
2008-10-31 11:26             ` Jan Kiszka
