All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pranith Kumar <bobby.prani@gmail.com>
To: Richard Henderson <rth@twiddle.net>
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>,
	serge.fdrv@gmail.com, alex.bennee@linaro.org
Subject: Re: [Qemu-devel] [RFC PATCH] tcg: Optimize fence instructions
Date: Tue, 19 Jul 2016 14:28:03 -0400	[thread overview]
Message-ID: <874m7lfpos.fsf@gmail.com> (raw)
In-Reply-To: <4ca466b4-2271-b08f-9766-cc944cfe8ba5@twiddle.net>


Richard Henderson writes:

> On 07/15/2016 01:59 AM, Pranith Kumar wrote:
>> This patch applies on top of the fence generation patch series.
>>
>> This commit optimizes fence instructions. Two optimizations are
>> currently implemented. These are:
>>
>> 1. Unnecessary duplicate fence instructions
>>
>>    If the same fence instruction is detected consecutively, we remove
>>    one instance of it.
>>
>>    ex: mb; mb => mb, strl; strl => strl
>>
>> 2. Merging weaker fence with subsequent/previous stronger fence
>>
>>    load-acquire/store-release fence can be combined with a full fence
>>    without relaxing the ordering constraint.
>>
>>    ex: a) ld; ldaq; mb => ld; mb
>>        b) mb; strl; st => mb; st
>>
>> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
>> ---
>>  tcg/optimize.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  tcg/tcg.h      |  1 +
>>  2 files changed, 60 insertions(+)
>>
>> diff --git a/tcg/optimize.c b/tcg/optimize.c
>> index c0d975b..a655829 100644
>> --- a/tcg/optimize.c
>> +++ b/tcg/optimize.c
>> @@ -569,6 +569,63 @@ static bool swap_commutative2(TCGArg *p1, TCGArg *p2)
>>      return false;
>>  }
>>
>> +/* Eliminate duplicate and unnecessary fence instructions */
>> +void tcg_optimize_mb(TCGContext *s)
>> +{
>> +    int oi, oi_next;
>> +    TCGArg prev_op_mb = -1;
>> +    TCGOp *prev_op;
>> +
>> +    for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
>> +        TCGOp *op = &s->gen_op_buf[oi];
>> +        TCGArg *args = &s->gen_opparam_buf[op->args];
>> +        TCGOpcode opc = op->opc;
>> +
>> +        switch (opc) {
>> +        case INDEX_op_mb:
>> +        {
>> +            TCGBar curr_mb_type = args[0] & 0xF0;
>> +            TCGBar prev_mb_type = prev_op_mb & 0xF0;
>
> No check to make sure that prev_op_mb != -1?
>

I don't think that is necessary since if prev_op_mb is -1 then prev_mb_type
will be 0x70 which will not match any of the cases we compare against below.

>> +
>> +            if (curr_mb_type == prev_mb_type ||
>> +                (curr_mb_type == TCG_BAR_STRL && prev_mb_type == TCG_BAR_SC)) {
>> +                /* Remove the current weaker barrier op. The previous
>> +                 * barrier is stronger and sufficient.
>> +                 * mb; strl => mb; st
>> +                 */
>> +                tcg_op_remove(s, op);
>> +            } else if (curr_mb_type == TCG_BAR_SC &&
>> +                       prev_mb_type == TCG_BAR_LDAQ) {
>> +                /* Remove the previous weaker barrier op. The current
>> +                 * barrier is stronger and sufficient.
>> +                 * ldaq; mb => ld; mb
>> +                 */
>> +                tcg_op_remove(s, prev_op);
>> +            } else if (curr_mb_type == TCG_BAR_STRL &&
>> +                       prev_mb_type == TCG_BAR_LDAQ) {
>> +                /* Consecutive load-acquire and store-release barriers
>> +                 * can be merged into one stronger SC barrier
>> +                 * ldaq; strl => ld; mb; st
>> +                 */
>> +                args[0] = (args[0] & 0x0F) | TCG_BAR_SC;
>> +                tcg_op_remove(s, prev_op);
>
> Don't you need to merge the bottom 4 bits as well?
>

That is an interesting point. Ideally, the TCG_MO_* flags for both the
barriers should be the same. Atleast now, for LDAR and STRL barriers this flag
is TCG_MO_ALL since we do not have any other variant.

>> +        default:
>> +            prev_op_mb = -1;
>> +            prev_op = NULL;
>> +            break;
>
> A reasonable start, I suppose, but there's nothing stopping you from 
> considering barriers that aren't exactly adjacent.
>
> You only need to reset prev_op_mb when you see qemu_ld, qemu_st, call, or 
> tcg_op_defs[opc].flags & TCG_OPF_BB_END.
>

That sounds like a good idea to extend this to. I will work on it.

Thanks!
-- 
Pranith

  reply	other threads:[~2016-07-19 18:28 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-14 20:29 [Qemu-devel] [RFC PATCH] tcg: Optimize fence instructions Pranith Kumar
2016-07-19  3:53 ` Richard Henderson
2016-07-19 18:28   ` Pranith Kumar [this message]
2016-07-19 14:36 ` Alex Bennée
2016-07-19 18:29   ` Pranith Kumar
2016-07-19 15:56 ` Alex Bennée
2016-07-19 18:33   ` Pranith Kumar
2016-07-19 17:16 ` Paolo Bonzini
2016-07-19 18:55   ` Pranith Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874m7lfpos.fsf@gmail.com \
    --to=bobby.prani@gmail.com \
    --cc=alex.bennee@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=serge.fdrv@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.