public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>,
	"Vineet.Gupta1@synopsys.com" <Vineet.Gupta1@synopsys.com>,
	Alexey Brodkin <Alexey.Brodkin@synopsys.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jbaron@akamai.com" <jbaron@akamai.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>
Subject: Re: [PATCH] ARC: ARCv2: jump label: implement jump label patching
Date: Thu, 20 Jun 2019 23:30:57 +0200	[thread overview]
Message-ID: <20190620213057.GD3436@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <a945de7d3b6f2da03c62c9e1043e125b4c4211aa.camel@synopsys.com>

On Thu, Jun 20, 2019 at 06:34:55PM +0000, Eugeniy Paltsev wrote:
> On Thu, 2019-06-20 at 09:01 +0200, Peter Zijlstra wrote:

> > In particular we do not need the alignment.
> > 
> > So what the x86 code does is:
> > 
> >  - overwrite the first byte of the instruction with a single byte trap
> >    instruction
> > 
> >  - machine wide IPI which synchronizes I$
> > 
> > At this point, any CPU that encounters this instruction will trap; and
> > the trap handler will emulate the 'new' instruction -- typically a jump.
> > 
> >   - overwrite the tail of the instruction (if there is a tail)
> > 
> >   - machine wide IPI which syncrhonizes I$
> > 
> > At this point, nobody will execute the tail, because we'll still trap on
> > that first single byte instruction, but if they were to read the
> > instruction stream, the tail must be there.
> > 
> >   - overwrite the first byte of the instruction to now have a complete
> >     instruction.
> > 
> >   - machine wide IPI which syncrhonizes I$
> > 
> > At this point, any CPU will encounter the new instruction as a whole,
> > irrespective of alignment.
> > 
> > 
> > So the benefit of this scheme is that is works irrespective of the
> > instruction fetch window size and don't need the 'funny' alignment
> > stuff.
> > 
> 
> Thanks for explanation. Now I understand how this x86 magic works.
> 
> However it looks like even more complex than ARM implementation.
> As I understand on ARM they do something like that:
> ---------------------------->8-------------------------
> on_each_cpu {
> 	write_instruction
> 	flush_data_cache_region
> 	invalidate_instruction_cache_region
> }
> ---------------------------->8-------------------------
> 
> https://elixir.bootlin.com/linux/v5.1/source/arch/arm/kernel/patch.c#L121
> 
> Yep, there is some overhead - as we don't need to do white and D$ flush on each cpu
> but that makes code simple and avoids additional checks.
> 
> And I don't understand in which cases x86 approach with trap is better.
> In this ARM implementation we do one machine wide IPI instead of three in x86 trap approach.
> 
> Probably there is some x86 specifics I don't get?

It's about variable instruction length; ARM (RISC in general) doesn't
have that, ARC does.

Your current proposal works by keeping the instruction inside of the
i-fetch window, but that then results in instruction padding (extra
NOPs). And that is fine, it really should work.

The x86 approach however allows you to get rid of that padding and
should work for unaligned variable length instructions (we have 1-15
byte instructions).

I just wanted to make sure you were aware of the possiblities such that
you made an informed decision, I'm not trying to force complexity on you
:-)

WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: "Vineet.Gupta1@synopsys.com" <Vineet.Gupta1@synopsys.com>,
	"jbaron@akamai.com" <jbaron@akamai.com>,
	"linux-snps-arc@lists.infradead.org"
	<linux-snps-arc@lists.infradead.org>,
	Alexey Brodkin <Alexey.Brodkin@synopsys.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>
Subject: Re: [PATCH] ARC: ARCv2: jump label: implement jump label patching
Date: Thu, 20 Jun 2019 23:30:57 +0200	[thread overview]
Message-ID: <20190620213057.GD3436@hirez.programming.kicks-ass.net> (raw)
Message-ID: <20190620213057.gxKgkDMqhG8DKCwhDiiEIys6ipVgEw0ZNbKx-5RgfmM@z> (raw)
In-Reply-To: <a945de7d3b6f2da03c62c9e1043e125b4c4211aa.camel@synopsys.com>

On Thu, Jun 20, 2019 at 06:34:55PM +0000, Eugeniy Paltsev wrote:
> On Thu, 2019-06-20 at 09:01 +0200, Peter Zijlstra wrote:

> > In particular we do not need the alignment.
> > 
> > So what the x86 code does is:
> > 
> >  - overwrite the first byte of the instruction with a single byte trap
> >    instruction
> > 
> >  - machine wide IPI which synchronizes I$
> > 
> > At this point, any CPU that encounters this instruction will trap; and
> > the trap handler will emulate the 'new' instruction -- typically a jump.
> > 
> >   - overwrite the tail of the instruction (if there is a tail)
> > 
> >   - machine wide IPI which syncrhonizes I$
> > 
> > At this point, nobody will execute the tail, because we'll still trap on
> > that first single byte instruction, but if they were to read the
> > instruction stream, the tail must be there.
> > 
> >   - overwrite the first byte of the instruction to now have a complete
> >     instruction.
> > 
> >   - machine wide IPI which syncrhonizes I$
> > 
> > At this point, any CPU will encounter the new instruction as a whole,
> > irrespective of alignment.
> > 
> > 
> > So the benefit of this scheme is that is works irrespective of the
> > instruction fetch window size and don't need the 'funny' alignment
> > stuff.
> > 
> 
> Thanks for explanation. Now I understand how this x86 magic works.
> 
> However it looks like even more complex than ARM implementation.
> As I understand on ARM they do something like that:
> ---------------------------->8-------------------------
> on_each_cpu {
> 	write_instruction
> 	flush_data_cache_region
> 	invalidate_instruction_cache_region
> }
> ---------------------------->8-------------------------
> 
> https://elixir.bootlin.com/linux/v5.1/source/arch/arm/kernel/patch.c#L121
> 
> Yep, there is some overhead - as we don't need to do white and D$ flush on each cpu
> but that makes code simple and avoids additional checks.
> 
> And I don't understand in which cases x86 approach with trap is better.
> In this ARM implementation we do one machine wide IPI instead of three in x86 trap approach.
> 
> Probably there is some x86 specifics I don't get?

It's about variable instruction length; ARM (RISC in general) doesn't
have that, ARC does.

Your current proposal works by keeping the instruction inside of the
i-fetch window, but that then results in instruction padding (extra
NOPs). And that is fine, it really should work.

The x86 approach however allows you to get rid of that padding and
should work for unaligned variable length instructions (we have 1-15
byte instructions).

I just wanted to make sure you were aware of the possiblities such that
you made an informed decision, I'm not trying to force complexity on you
:-)

  parent reply	other threads:[~2019-06-20 21:30 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-14 16:40 [PATCH] ARC: ARCv2: jump label: implement jump label patching Eugeniy Paltsev
2019-06-14 16:40 ` Eugeniy Paltsev
2019-06-18 16:16 ` Vineet Gupta
2019-06-18 16:16   ` Vineet Gupta
2019-06-19  8:12   ` Peter Zijlstra
2019-06-19  8:12     ` Peter Zijlstra
2019-06-19 23:55     ` Vineet Gupta
2019-06-19 23:55       ` Vineet Gupta
2019-06-20  7:01       ` Peter Zijlstra
2019-06-20  7:01         ` Peter Zijlstra
2019-06-20 18:34         ` Eugeniy Paltsev
2019-06-20 18:34           ` Eugeniy Paltsev
2019-06-20 21:30           ` Peter Zijlstra [this message]
2019-06-20 21:30             ` Peter Zijlstra
2019-06-20 18:48         ` Vineet Gupta
2019-06-20 18:48           ` Vineet Gupta
2019-06-20 21:22           ` Peter Zijlstra
2019-06-20 21:22             ` Peter Zijlstra
2019-06-21 12:09             ` Peter Zijlstra
2019-06-21 12:09               ` Peter Zijlstra
2019-06-21 12:12               ` Peter Zijlstra
2019-06-21 12:12                 ` Peter Zijlstra
2019-06-21 18:37                 ` Nadav Amit
2019-06-21 18:37                   ` Nadav Amit
2019-06-20  7:15       ` Peter Zijlstra
2019-06-20  7:15         ` Peter Zijlstra
2019-06-20  7:21       ` Peter Zijlstra
2019-06-20  7:21         ` Peter Zijlstra
2019-06-20  7:52       ` Peter Zijlstra
2019-06-20  7:52         ` Peter Zijlstra
2019-06-20 20:49         ` Vineet Gupta
2019-06-20 20:49           ` Vineet Gupta
2019-06-21 15:39           ` Alexey Brodkin
2019-06-21 15:39             ` Alexey Brodkin
2019-06-20 18:34     ` Eugeniy Paltsev
2019-06-20 18:34       ` Eugeniy Paltsev
2019-06-20 21:12       ` Peter Zijlstra
2019-06-20 21:12         ` Peter Zijlstra
2019-06-28 22:59   ` Vineet Gupta
2019-06-28 22:59     ` Vineet Gupta
2019-07-03 16:15   ` Vineet Gupta
2019-07-03 16:15     ` Vineet Gupta
2019-07-17 15:09   ` Eugeniy Paltsev
2019-07-17 15:09     ` Eugeniy Paltsev
2019-07-17 17:45     ` Vineet Gupta
2019-07-17 17:45       ` Vineet Gupta
2019-07-17 18:54       ` Eugeniy Paltsev
2019-07-17 18:54         ` Eugeniy Paltsev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190620213057.GD3436@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=Alexey.Brodkin@synopsys.com \
    --cc=Eugeniy.Paltsev@synopsys.com \
    --cc=Vineet.Gupta1@synopsys.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=jbaron@akamai.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox