From: Yao Zi <ziyao@disroot.org>
To: "Alexandre Ghiti" <alex@ghiti.fr>,
	"Andy Chiu" <andybnac@gmail.com>,
	alexghiti@rivosinc.com, palmer@dabbelt.com,
	"Andy Chiu" <andy.chiu@sifive.com>,
	"Björn Töpel" <bjorn@rivosinc.com>,
	"Mark Rutland" <mark.rutland@arm.com>,
	puranjay12@gmail.com, paul.walmsley@sifive.com,
	greentime.hu@sifive.com, nick.hu@sifive.com,
	nylon.chen@sifive.com, eric.lin@sifive.com,
	vicent.chen@sifive.com, zong.li@sifive.com,
	yongxuan.wang@sifive.com, samuel.holland@sifive.com,
	olivia.chu@sifive.com, c2232430@gmail.com
Cc: Han Gao <rabenda.cn@gmail.com>,
	Vivian Wang <wangruikang@iscas.ac.cn>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	regressions@lists.linux.dev, linux-riscv@lists.infradead.org
Subject: Re: [REGRESSION] Random oops on SG2042 with Linux 6.16-rc and dynamic ftrace
Date: Wed, 2 Jul 2025 10:50:24 +0000
Message-ID: <aGUO8L7oXpvEpvZo@pie.lan>
In-Reply-To: <b060e694-caa0-4aa5-ac67-75531a5f60eb@ghiti.fr>

On Tue, Jul 01, 2025 at 02:27:32PM +0200, Alexandre Ghiti wrote:
> Hi Yao,
> 
> On 7/1/25 08:41, Yao Zi wrote:
> > Linux v6.16 built with dynamic ftrace randomly oopses or triggers
> > ftrace_bug() on Sophgo SG2042 when booting systemd-based userspace,

...

> > Not sure whether reverting the commits or fixing them up is the better
> > idea, but either way the fatal first issue shouldn't go into the stable
> > release.
> 
> Let's fix this, we were expecting issues with dynamic ftrace :)
> 
> So the following diff fixes all the issues you mentioned (not the first
> crash though, I'll let you test and see if it works better, I don't have
> this board):

Thanks for the fix! I've tested it on both QEMU and SG2042, and it fixes
the lockdep failures as well as the boot-time crash on SG2042. The
boot-time crash is caused by the race, so it disappears once the race is
fixed.

> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index 4c6c24380cfd9..97ced537aa1e0 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -14,6 +14,16 @@
>  #include <asm/text-patching.h>
> 
>  #ifdef CONFIG_DYNAMIC_FTRACE
> +void ftrace_arch_code_modify_prepare(void)
> +{
> +       mutex_lock(&text_mutex);
> +}
> +
> +void ftrace_arch_code_modify_post_process(void)
> +{
> +       mutex_unlock(&text_mutex);
> +}
> +
>  unsigned long ftrace_call_adjust(unsigned long addr)
>  {
>         if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
> @@ -29,10 +39,8 @@ unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip)
> 
>  void arch_ftrace_update_code(int command)
>  {
> -       mutex_lock(&text_mutex);
>         command |= FTRACE_MAY_SLEEP;
>         ftrace_modify_all_code(command);
> -       mutex_unlock(&text_mutex);
>         flush_icache_all();
>  }
> 
> @@ -149,16 +157,17 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
>         unsigned int nops[2], offset;
>         int ret;
> 
> +       mutex_lock(&text_mutex);

Besides using the guard API, could we swap the order of
ftrace_rec_set_nop_ops() and the calculation of the nops array, so that
the array is computed before the lock is taken? This shrinks the
critical region a little.
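
To make the two suggestions concrete, here is a minimal userspace sketch,
not kernel code. It assumes "the guard API" means a scope-based lock in
the style of the kernel's guard(mutex)(&text_mutex), and it imitates that
with a pthread mutex and the GCC/Clang cleanup attribute. All names in it
(SCOPED_LOCK, text_lock, set_nop_ops, init_nop, lock_depth) are
hypothetical stand-ins, not the actual ftrace code.

```c
/* Userspace analogue only; every identifier here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

static pthread_mutex_t text_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for text_mutex */
static uint32_t text[2];   /* stands in for the two instructions being patched */
static int lock_depth;     /* nonzero while text_lock is held, for checking only */

/* Cleanup handler: runs whenever the guard variable goes out of scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	int rc;

	lock_depth--;
	rc = pthread_mutex_unlock(*m);
	assert(rc == 0);
}

/* Take the lock now and release it on every exit path from this scope. */
#define SCOPED_LOCK(m) \
	pthread_mutex_t *scoped_lock_ __attribute__((cleanup(unlock_cleanup))) = (m); \
	pthread_mutex_lock(scoped_lock_); \
	lock_depth++

/* Stand-in for ftrace_rec_set_nop_ops(): fails on request. */
static int set_nop_ops(int fail)
{
	return fail ? -1 : 0;
}

int init_nop(uint32_t insn0, uint32_t insn1, int fail)
{
	uint32_t nops[2];
	int ret;

	/* Only local data is touched here, so this runs before locking. */
	nops[0] = insn0;
	nops[1] = insn1;

	SCOPED_LOCK(&text_lock);

	ret = set_nop_ops(fail);
	if (ret)
		return ret;	/* no "goto end": the cleanup handler unlocks */

	memcpy(text, nops, sizeof(nops));	/* stands in for patch_insn_write() */
	return 0;
}
```

With the scoped lock, the early return needs no unlock label, and only the
ops update and the write to the patch site stay inside the critical
region.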

With or without the change, here's my tag,

Tested-by: Yao Zi <ziyao@disroot.org>

and also

Reported-by: Han Gao <rabenda.cn@gmail.com>
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>

for their first-hand report of the boot-time crash and their analysis of
the first lock issue.

Regards,
Yao Zi

>         ret = ftrace_rec_set_nop_ops(rec);
>         if (ret)
> -               return ret;
> +               goto end;
> 
>         offset = (unsigned long) &ftrace_caller - pc;
>         nops[0] = to_auipc_t0(offset);
>         nops[1] = RISCV_INSN_NOP4;
> 
> -       mutex_lock(&text_mutex);
>         ret = patch_insn_write((void *)pc, nops, 2 * MCOUNT_INSN_SIZE);
> +end:
>         mutex_unlock(&text_mutex);
> 
>         return ret;
> 
> Andy is also taking a look, I'll let him confirm the above fix is correct.
> 
> Thanks for the thorough report!
> 
> Alex
> 
> 
> > 
> > Thanks for your suggestions on the problems.
> > 
> > Regards,
> > Yao Zi
> > 
> > [1]: https://lore.kernel.org/all/20250407180838.42877-1-andybnac@gmail.com/
> > 
> > #regzbot introduced: 881dadf0792c
> > 

