[U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7

From: Aneesh V <aneesh@ti.com>
To: u-boot@lists.denx.de
Subject: [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
Date: Thu, 13 Jan 2011 16:40:18 +0530
Message-ID: <4D2EDD9A.1000403@ti.com>
In-Reply-To: <4D2DFE77.8000104@free.fr>

On Thursday 13 January 2011 12:48 AM, Albert ARIBAUD wrote:
> (I realize I did not answer the other ones)
>
> On 08/01/2011 11:06, Aneesh V wrote:
>
>>> Out of curiosity, can you elaborate on why the compiler would optimize
>>> better in these cases?
>>
>> While counting down the termination condition check is against 0. So
>> you can just decrement the loop count using a 'subs' and do a 'bne'.
>> When you count up you have to do a comparison with a non-zero value. So
>> you will need one 'cmp' instruction extra:-)
>
> I would not try to be too smart about what instructions are generated
> and how by a compiler such as gcc which has rather complex code
> generation optimizations.

IMHO, on ARM, comparing with 0 is always going to be more efficient
than comparing with a non-zero number for a termination condition,
assuming a decent compiler.
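
A minimal sketch of the two loop shapes being compared. The function
names are made up for illustration and are not from the patch. Both
do the same work; only the termination test differs, and in the
count-down form a compiler may be able to fold the test into the
decrement ('subs' followed by a branch) instead of emitting a separate
'cmp'. What a given compiler actually emits has to be checked in the
generated assembly.

```c
#include <stdint.h>

/* Hypothetical helper: sums n words, counting the index up. */
uint32_t sum_count_up(const uint32_t *buf, int n)
{
	uint32_t sum = 0;
	int i;

	for (i = 0; i < n; i++)		/* loop test compares i against n */
		sum += buf[i];
	return sum;
}

/* Same work, counting down so that the loop test is against 0. */
uint32_t sum_count_down(const uint32_t *buf, int n)
{
	uint32_t sum = 0;
	int i;

	for (i = n - 1; i >= 0; i--)	/* loop test compares i against 0 */
		sum += buf[i];
	return sum;
}
```

Both functions return the same sum for the same input, so the
count-down form can be swapped in wherever the iteration order does
not matter.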

>
>> bigger loop inside because that reduces the frequency at which your
>> outer parameter changes and hence the overall number of instructions
>> executed. Consider this:
>> 1. We encode both the loop counts along with other data into a register
>> that is finally written to CP15 register.
>> 2. outer loop has the code for shifting and ORing the outer variable to
>> this register.
>> 3. Inner loop has the code for shifting and ORing the inner variable.
>> Step (3) has to be executed 'way x set' number of times anyways.
>> But having bigger loop inside makes sure that 2 is executed fewer times!
>
> Here too it seems like you're underestimating the compiler's optimizing
> capabilities -- your explanation seems to amount to extracting a
> constant calculation from a loop, something that is rather usual in code
> optimizing.

It's not a constant calculation. It's based on the loop index. And this
optimization does not rely on compiler specifics. It is a logic-level
optimization, so it should generally give good results with all
compilers. Perhaps I was wrong in stating that it helps in getting
better assembly. It just helps run-time efficiency.
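
As a rough illustration of the loop-ordering argument in steps 1-3
quoted above, the sketch below only counts how often the outer and
inner encoding steps run for a given nesting. The struct, the function
name and the example sizes are assumptions made for illustration; none
of them come from the patch.

```c
/* Hypothetical counters, only for illustration. */
struct step_counts {
	unsigned long outer_steps;	/* times the outer field is encoded */
	unsigned long inner_steps;	/* times the inner field is encoded */
};

/* Runs outer_n iterations outside and inner_n iterations inside. */
struct step_counts count_steps(unsigned int outer_n, unsigned int inner_n)
{
	struct step_counts c = { 0, 0 };
	unsigned int o, i;

	for (o = outer_n; o > 0; o--) {
		c.outer_steps++;		/* step 2: shift/OR outer variable */
		for (i = inner_n; i > 0; i--)
			c.inner_steps++;	/* step 3: shift/OR inner variable */
	}
	return c;
}
```

With made-up sizes of 4 ways and 256 sets, count_steps(4, 256) runs the
outer step 4 times and count_steps(256, 4) runs it 256 times, while the
inner step runs 1024 times in both cases.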

Actually, in my experience (in this same context), GCC does a terrible
job at this! For instance:

+		for (set = num_sets - 1; set >= 0; set--) {
+			setway = (level << 1) | (set << log2_line_len) |
+				 (way << way_shift);

Here, way_shift = 32 - log2_num_ways

But if you replace way_shift with that expression inline, GCC puts the
subtraction instruction inside the loop, even though it is clearly loop
invariant. So I had to move it out of the loop explicitly!
In fact, I was thinking of reporting this to GCC.
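
A self-contained sketch of the hoisted form, for reference. It is not
the code from the patch: the function name, its parameters and the
'out' buffer are assumptions, and the values are stored in an array
instead of being written to CP15 so that the encoding can be inspected
on any host.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: stores each set/way value in 'out' and returns
 * the number of values written. Assumes log2_num_ways is in 1..31 and
 * that 'out' has room for num_ways * num_sets entries.
 */
size_t encode_setways(uint32_t level, int num_ways, int num_sets,
		      uint32_t log2_line_len, uint32_t log2_num_ways,
		      uint32_t *out)
{
	/* Loop invariant, computed once outside both loops. */
	const uint32_t way_shift = 32 - log2_num_ways;
	size_t n = 0;
	int way, set;

	for (way = num_ways - 1; way >= 0; way--) {
		for (set = num_sets - 1; set >= 0; set--) {
			out[n++] = (level << 1) |
				   ((uint32_t)set << log2_line_len) |
				   ((uint32_t)way << way_shift);
		}
	}
	return n;
}
```

For example, with level 0, 2 ways, 2 sets, log2_line_len 5 and
log2_num_ways 1, way_shift is 31 and the first value stored is
0x80000020 (way 1, set 1).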

>
>> With these tweaks the assembly code generated by this C code is as good
>> as the original hand-written assembly code with my compiler.
>
> How about other compilers?
>

I haven't tested other compilers. However, as I mentioned above, the
latter one is a logic-level optimization. The former should hopefully
help all ARM compilers.

As you probably know, the existing code for cache maintenance was in
assembly. When I moved it to C, I wanted to make sure that the
generated code is as good as the original assembly for this critical
piece of code (I expected some criticism about moving it to C :-)).
That's why I checked the generated code and made these, hopefully
minor, tweaks to make it better. I hope they don't have any serious
drawbacks.

Best regards,
Aneesh
