public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Guo Ren <guoren@kernel.org>
Cc: Evgenii Shatokhin <e.shatokhin@yadro.com>,
	suagrfillet@gmail.com, andy.chiu@sifive.com,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Guo Ren <guoren@linux.alibaba.com>,
	anup@brainfault.org, paul.walmsley@sifive.com,
	palmer@dabbelt.com, conor.dooley@microchip.com, heiko@sntech.de,
	rostedt@goodmis.org, mhiramat@kernel.org, jolsa@redhat.com,
	bp@suse.de, jpoimboe@kernel.org, linux@yadro.com
Subject: Re: [PATCH -next V7 0/7] riscv: Optimize function trace
Date: Tue, 7 Feb 2023 09:16:55 +0000	[thread overview]
Message-ID: <Y+IXB4xQ7ACQWC9U@FVFF77S0Q05N> (raw)
In-Reply-To: <CAJF2gTQ6U1vH79Mu53eQ-GVaFx36C-hEt9Qf6=_vAkHfmgFh1Q@mail.gmail.com>

On Tue, Feb 07, 2023 at 11:57:06AM +0800, Guo Ren wrote:
> On Mon, Feb 6, 2023 at 5:56 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > The DYNAMIC_FTRACE_WITH_CALL_OPS patches should be in v6.3. They're currently
> > queued in the arm64 tree in the for-next/ftrace branch:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/ftrace
> >   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/
> >
> > ... and those *should* be in v6.3.
> Glade to hear that. Great!
> 
> >
> > Patches to imeplement DIRECT_CALLS atop that are in review at the moment:
> >
> >   https://lore.kernel.org/linux-arm-kernel/20230201163420.1579014-1-revest@chromium.org/
> Good reference. Thx for sharing.
> 
> >
> > ... and if riscv uses the CALL_OPS approach, I believe it can do much the same
> > there.
> >
> > If riscv wants to do a single atomic patch to each patch-site (to avoid
> > stop_machine()), then direct calls would always needs to bounce through the
> > ftrace_caller trampoline (and acquire the direct call from the ftrace_ops), but
> > that might not be as bad as it sounds -- from benchmarking on arm64, the bulk
> > of the overhead seen with direct calls is when using the list_ops or having to
> > do a hash lookup, and both of those are avoided with the CALL_OPS approach.
> > Calling directly from the patch-site is a minor optimization relative to
> > skipping that work.
> Yes, CALL_OPS could solve the PREEMPTION & stop_machine problems. I
> would follow up.
> 
> The difference from arm64 is that RISC-V is 16bit/32bit mixed
> instruction ISA, so we must keep ftrace_caller & ftrace_regs_caller in
> 2048 aligned. Then:

Where does the 2048-bit alignment requirement come from?

Note that I'm assuming you will *always* go through a common ftrace_caller
trampoline (even for direct calls), with the trampoline responsible for
recovering the direct trampoline (or ops->func) from the ops pointer.

That would only require 64-bit alignment on 64-bit (or 32-bit alignment on
32-bit) to keep the literal naturally-aligned; the rest of the instructions
wouldn't require additional alignment.

For example, I would expect that (for 64-bit) you'd use:

  # place 2 NOPs *immediately before* the function, and 3 NOPs at the start
  -fpatchable-function-entry=5,2

  # Align the function to 8-bytes
  -falign=functions=8

... and your trampoline in each function could be initialized to:

  # Note: aligned to 8 bytes
  addr-08		// Literal (first 32-bits)	// set to ftrace_nop_ops
  addr-04		// Literal (last 32-bits)	// set to ftrace_nop_ops
  addr+00	func:	mv	t0, ra
  addr+04		auipc	t1, ftrace_caller
  addr+08		nop

... and when enabled can be set to:

  # Note: aligned to 8 bytes
  addr-08		// Literal (first 32-bits)	// patched to ops ptr
  addr-04		// Literal (last 32-bits)	// patched to ops ptr
  addr+00	func:	mv	t0, ra
  addr+04		auipc	t1, ftrace_caller
  addr+08		jalr	ftrace_caller(t1)

Note: this *only* requires patching the literal and NOP<->JALR; the MV and
AUIPC aren't harmful and can always be there. This way, you won't need to use
stop_machine().

With that, the ftrace_caller trampoline can recover the `ops` pointer at a
negative offset from `ra`, and can recover the instrumented function's return
address in `t0`. Using the `ops` pointer, it can figure out whether to branch
to a direct trampoline or whether to save/restore the regs around invoking
ops->func.

For 32-bit it would be exactly the same, except you'd only need a single nop
before the function, and the offset would be -0x10.

That's what arm64 does; the only difference is that riscv would *always* need
to go via the trampoline in order to make direct calls.

Thanks,
Mark.

  reply	other threads:[~2023-02-07  9:17 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12  9:05 [PATCH -next V7 0/7] riscv: Optimize function trace guoren
2023-01-12  9:05 ` [PATCH -next V7 1/7] riscv: ftrace: Fixup panic by disabling preemption guoren
2023-01-12 12:16   ` Mark Rutland
2023-01-12 12:57     ` Mark Rutland
2023-01-28  9:45       ` Guo Ren
2023-01-28  9:37     ` Guo Ren
2023-01-30 10:54       ` Mark Rutland
2023-02-04  1:19         ` Guo Ren
2023-01-12  9:05 ` [PATCH -next V7 2/7] riscv: ftrace: Remove wasted nops for !RISCV_ISA_C guoren
2023-01-12  9:05 ` [PATCH -next V7 3/7] riscv: ftrace: Reduce the detour code size to half guoren
2023-01-16 14:11   ` Evgenii Shatokhin
2023-01-12  9:06 ` [PATCH -next V7 4/7] riscv: ftrace: Add ftrace_graph_func guoren
2023-01-12  9:06 ` [PATCH -next V7 5/7] riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support guoren
2023-01-12  9:06 ` [PATCH -next V7 6/7] samples: ftrace: Add riscv support for SAMPLE_FTRACE_DIRECT[_MULTI] guoren
2023-01-16 14:30   ` Evgenii Shatokhin
2023-01-17  9:32     ` Song Shuai
2023-01-17 13:16       ` Evgenii Shatokhin
2023-01-17 16:22         ` Evgenii Shatokhin
2023-01-18  2:37           ` Song Shuai
2023-01-18 15:19             ` Evgenii Shatokhin
2023-01-19  6:05               ` Guo Ren
2023-02-18 21:30                 ` Palmer Dabbelt
2023-02-20  2:46                   ` Song Shuai
2023-02-21  3:56                     ` Guo Ren
2023-02-21  4:02                   ` Guo Ren
2023-01-12  9:06 ` [PATCH -next V7 7/7] riscv : select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY guoren
2023-01-16 15:02 ` [PATCH -next V7 0/7] riscv: Optimize function trace Evgenii Shatokhin
2023-02-04  6:40   ` Guo Ren
2023-02-06  9:56     ` Mark Rutland
2023-02-07  3:57       ` Guo Ren
2023-02-07  9:16         ` Mark Rutland [this message]
2023-02-08  2:30           ` Guo Ren
2023-02-08 14:46             ` Mark Rutland
2023-02-09  1:31               ` Guo Ren
2023-02-09 22:46                 ` David Laight
2023-02-10  2:18                   ` Guo Ren
2023-02-08 22:29             ` David Laight
2023-02-09  1:51               ` Guo Ren
2023-02-09  1:59                 ` Guo Ren
2023-02-09  9:54                   ` Mark Rutland
2023-02-10  2:21                     ` Guo Ren
2023-02-09  9:00                 ` David Laight
2023-02-09  9:11                   ` Guo Ren
2023-02-18 21:42 ` patchwork-bot+linux-riscv

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+IXB4xQ7ACQWC9U@FVFF77S0Q05N \
    --to=mark.rutland@arm.com \
    --cc=andy.chiu@sifive.com \
    --cc=anup@brainfault.org \
    --cc=bp@suse.de \
    --cc=conor.dooley@microchip.com \
    --cc=e.shatokhin@yadro.com \
    --cc=guoren@kernel.org \
    --cc=guoren@linux.alibaba.com \
    --cc=heiko@sntech.de \
    --cc=jolsa@redhat.com \
    --cc=jpoimboe@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux@yadro.com \
    --cc=mhiramat@kernel.org \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=rostedt@goodmis.org \
    --cc=suagrfillet@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox