public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>,
	Christian Kujau <lists@nerdbynature.de>,
	Robert Hancock <hancockr@shaw.ca>,
	john stultz <johnstul@us.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] common implementation of iterative div/mod
Date: Thu, 08 May 2008 22:57:34 +0100	[thread overview]
Message-ID: <4823774E.2040207@goop.org> (raw)
In-Reply-To: <e2d8bb81dc41cc3965a6ccba143f7cfa@kernel.crashing.org>

Segher Boessenkool wrote:
>> The workaround is to put a dummy asm statement in the loop to prevent
>> gcc from performing the transformation.
>
> It's not a "dummy" asm, it actually does something: it tells the
> compiler that it has to iterate the loop exactly as written, and
> not do something else.  I.e., exactly the behaviour we want here.

No, it has no function of its own.  It's bullying gcc into not 
performing an optimisation by giving the impression its doing something.

>> +    ticks = iter_div_u64_rem(blocked, NS_PER_TICK, &blocked);
>
> What a terrible function name.

It's consistent with the other functions defined here.  I agree it isn't 
pretty.  If you have a better suggestion, I'm all ears.

>> static inline void timespec_add_ns(struct timespec *a, u64 ns)
>> {
>> -    ns += a->tv_nsec;
>> -    while(unlikely(ns >= NSEC_PER_SEC)) {
>> -        /* The following asm() prevents the compiler from
>> -         * optimising this loop into a modulo operation.  */
>> -        asm("" : "+r"(ns));
>> -
>> -        ns -= NSEC_PER_SEC;
>> -        a->tv_sec++;
>> -    }
>> +    a->tv_sec += iter_div_u64_rem(a->tv_nsec + ns, NSEC_PER_SEC, &ns);
>>     a->tv_nsec = ns;
>> }
>
> ...and now the "meat" of this function isn't inline anymore.  If we
> cared about not doing a divide here, you'll have to explain why
> taking this trivial loop out of line is a good idea.

On x86-32 it compiles to 26 instructions and 47 bytes of code.  
Admittedly it might be smaller inline - or on a 64-bit machine - but I 
seriously doubt its suffering a huge performance hit from being out of 
line.  These days the inline threshold is very small - probably under 10 
instructions.  A direct call/return is essentially free, since it can be 
trivially prefetched.

>> +unsigned iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder)
>> +{
>> +    unsigned ret = 0;
>> +
>> +    while(dividend >= divisor) {
>
> You removed the unlikely() here.  Why?

Because it didn't seem all that unlikely.  Besides, it makes not one bit 
of difference to the code generated by my compiler.

>> +        /* The following asm() prevents the compiler from
>> +           optimising this loop into a modulo operation.  */
>> +        asm("" : "+rm"(dividend));
>
> You changed "+r" to "+rm" here.  Why?  Also, "rm" is an x86-ism,
> and this is generic code (it does work here, but why is it better
> than "r"?)

"rm" isn't x86-specific.  I just wanted to give the compiler the freedom 
to put the value in either register or memory if it wanted to.

>> +EXPORT_SYMBOL(iter_div_u64_rem);
>
> Does this need to be exported?

Everything else in the file is exported.

> Can I suggest an alternative approach?  Define a macro (with a
> good, descriptive name!) for just the asm("" : "+r"(x)), and use
> that.  Much smaller patch, much clearer code, and doesn't result
> in different (and worse) code generation, so it's a much safer
> change.

Uh, could you suggest a name?  Something along the lines of 
prevent_gcc_from_strength_reducing_this_subtraction_loop_into_a_modulo_operation_thanks_oh_and_remember_to_use_it_in_all_the_right_places() 
springs to mind.

Rather than putting this unsightly (though with a smear of lipstick) 
hack into every open-coded iterative div-mod loop, we may as well 
implement it once and just look out for places which should be using it.

I don't think the "worse" code generation is much of an issue, since it 
isn't used anywhere performance critical, and moving the code out of 
line means there should be an overall reduction in code size.

    J

  reply	other threads:[~2008-05-08 21:57 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <fa.QTbvQYXhEm5VNP5dvkl5JG7NHYQ@ifi.uio.no>
2008-05-04 17:35 ` undefined reference to __udivdi3 (gcc-4.3) Robert Hancock
2008-05-04 22:19   ` Segher Boessenkool
2008-05-07  9:29     ` Jeremy Fitzhardinge
2008-05-08 15:16       ` [PATCH] common implementation of iterative div/mod Jeremy Fitzhardinge
2008-05-08 20:26         ` Andrew Morton
2008-05-08 22:00           ` Jeremy Fitzhardinge
2008-05-08 20:52         ` Segher Boessenkool
2008-05-08 21:57           ` Jeremy Fitzhardinge [this message]
2008-05-09 11:45         ` Christian Kujau
2008-05-14  6:46         ` Andrew Morton
2008-05-14  7:33           ` Jeremy Fitzhardinge
2008-05-14  8:33             ` Andi Kleen
2008-05-14  9:55               ` Jeremy Fitzhardinge
2008-05-14 10:50                 ` Andi Kleen
2008-05-14 10:52                   ` Jeremy Fitzhardinge
2008-05-14 11:21                     ` Andi Kleen
2008-05-14 12:58                       ` Jeremy Fitzhardinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4823774E.2040207@goop.org \
    --to=jeremy@goop.org \
    --cc=akpm@linux-foundation.org \
    --cc=hancockr@shaw.ca \
    --cc=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lists@nerdbynature.de \
    --cc=mingo@elte.hu \
    --cc=segher@kernel.crashing.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox