* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-04 16:40 ` Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2023-07-04 16:40 UTC (permalink / raw)
To: guoren
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> debug the kernel.
>
> Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> of the kernel side.
This doesn't explain much if anything :/
I'm confused (probably because I don't know RISC-V very well), what's
EBREAK and how does it happen?
Specifically, if EBREAK can happen inside an local_irq_disable() region,
then the below change is actively wrong. Any exception/interrupt that
can happen while local_irq_disable() must be treated like an NMI.
If that makes kdb unhappy, fix kdb.
> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> Reported-by: Daniel Thompson <daniel.thompson@linaro.org>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Cc: stable@vger.kernel.org
> ---
> arch/riscv/kernel/traps.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index efc6b649985a..ed0eb9452f9e 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -18,6 +18,7 @@
> #include <linux/irq.h>
> #include <linux/kexec.h>
> #include <linux/entry-common.h>
> +#include <linux/context_tracking.h>
>
> #include <asm/asm-prototypes.h>
> #include <asm/bug.h>
> @@ -257,11 +258,11 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
>
> irqentry_exit_to_user_mode(regs);
> } else {
> - irqentry_state_t state = irqentry_nmi_enter(regs);
> + enum ctx_state prev_state = exception_enter();
>
> handle_break(regs);
>
> - irqentry_nmi_exit(regs, state);
> + exception_exit(prev_state);
> }
> }
>
> --
> 2.36.1
>
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
2023-07-04 16:40 ` Peter Zijlstra
@ 2023-07-04 17:34 ` Daniel Thompson
-1 siblings, 0 replies; 20+ messages in thread
From: Daniel Thompson @ 2023-07-04 17:34 UTC (permalink / raw)
To: Peter Zijlstra
Cc: guoren, arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, linux-arch, linux-kernel, linux-riscv,
stable, Guo Ren
On Tue, Jul 04, 2023 at 06:40:03PM +0200, Peter Zijlstra wrote:
> On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > debug the kernel.
> >
> > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > of the kernel side.
>
> This doesn't explain much if anything :/
>
> I'm confused (probably because I don't know RISC-V very well), what's
> EBREAK and how does it happen?
Among other things ebreak is part of the BUG() macro (although it is
also used to programmatically enter kgdb).
> Specifically, if EBREAK can happen inside an local_irq_disable() region,
> then the below change is actively wrong. Any exception/interrupt that
> can happen while local_irq_disable() must be treated like an NMI.
>
> If that makes kdb unhappy, fix kdb.
The only relationship this problem has to kgdb/kdb is that is was found
using the kgdb test suite. However the panic is absolutely nothing to
do with kgdb.
I would never normally be so sure regarding the absence of bugs in kgdb
but in this case it can be reproduced when kgdb is not enabled in the
KConfig which I think puts it in the clear!
Reproduction is simply:
/bin/echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
Above will panic the kernel but, absent options specifically requesting
a panic, this should kill the echo process rather than killing the kernel.
Daniel.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-04 17:34 ` Daniel Thompson
0 siblings, 0 replies; 20+ messages in thread
From: Daniel Thompson @ 2023-07-04 17:34 UTC (permalink / raw)
To: Peter Zijlstra
Cc: guoren, arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, linux-arch, linux-kernel, linux-riscv,
stable, Guo Ren
On Tue, Jul 04, 2023 at 06:40:03PM +0200, Peter Zijlstra wrote:
> On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > debug the kernel.
> >
> > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > of the kernel side.
>
> This doesn't explain much if anything :/
>
> I'm confused (probably because I don't know RISC-V very well), what's
> EBREAK and how does it happen?
Among other things ebreak is part of the BUG() macro (although it is
also used to programmatically enter kgdb).
> Specifically, if EBREAK can happen inside an local_irq_disable() region,
> then the below change is actively wrong. Any exception/interrupt that
> can happen while local_irq_disable() must be treated like an NMI.
>
> If that makes kdb unhappy, fix kdb.
The only relationship this problem has to kgdb/kdb is that is was found
using the kgdb test suite. However the panic is absolutely nothing to
do with kgdb.
I would never normally be so sure regarding the absence of bugs in kgdb
but in this case it can be reproduced when kgdb is not enabled in the
KConfig which I think puts it in the clear!
Reproduction is simply:
/bin/echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
Above will panic the kernel but, absent options specifically requesting
a panic, this should kill the echo process rather than killing the kernel.
Daniel.
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
2023-07-04 16:40 ` Peter Zijlstra
@ 2023-07-09 2:30 ` Guo Ren
-1 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2023-07-09 2:30 UTC (permalink / raw)
To: Peter Zijlstra
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > debug the kernel.
> >
> > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > of the kernel side.
>
> This doesn't explain much if anything :/
>
> I'm confused (probably because I don't know RISC-V very well), what's
> EBREAK and how does it happen?
EBREAK is just an instruction of riscv which would rise breakpoint exception.
>
> Specifically, if EBREAK can happen inside an local_irq_disable() region,
> then the below change is actively wrong. Any exception/interrupt that
> can happen while local_irq_disable() must be treated like an NMI.
When the ebreak happend out of local_irq_disable region, but
__nmi_enter forces handle_break() into in_interupt() state. So how
about:
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d..69f7043a98b9 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -18,6 +18,7 @@
#include <linux/irq.h>
#include <linux/kexec.h>
#include <linux/entry-common.h>
+#include <linux/context_tracking.h>
#include <asm/asm-prototypes.h>
#include <asm/bug.h>
@@ -285,12 +286,18 @@ asmlinkage __visible __trap_section void
do_trap_break(struct pt_regs *regs)
handle_break(regs);
irqentry_exit_to_user_mode(regs);
- } else {
+ } else if (in_interrupt()){
irqentry_state_t state = irqentry_nmi_enter(regs);
handle_break(regs);
irqentry_nmi_exit(regs, state);
+ } else {
+ enum ctx_state prev_state = exception_enter();
+
+ handle_break(regs);
+
+ exception_exit(prev_state);
}
}
>
> If that makes kdb unhappy, fix kdb.
>
> > Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> > Reported-by: Daniel Thompson <daniel.thompson@linaro.org>
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> > arch/riscv/kernel/traps.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > index efc6b649985a..ed0eb9452f9e 100644
> > --- a/arch/riscv/kernel/traps.c
> > +++ b/arch/riscv/kernel/traps.c
> > @@ -18,6 +18,7 @@
> > #include <linux/irq.h>
> > #include <linux/kexec.h>
> > #include <linux/entry-common.h>
> > +#include <linux/context_tracking.h>
> >
> > #include <asm/asm-prototypes.h>
> > #include <asm/bug.h>
> > @@ -257,11 +258,11 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> >
> > irqentry_exit_to_user_mode(regs);
> > } else {
> > - irqentry_state_t state = irqentry_nmi_enter(regs);
> > + enum ctx_state prev_state = exception_enter();
> >
> > handle_break(regs);
> >
> > - irqentry_nmi_exit(regs, state);
> > + exception_exit(prev_state);
> > }
> > }
> >
> > --
> > 2.36.1
> >
--
Best Regards
Guo Ren
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-09 2:30 ` Guo Ren
0 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2023-07-09 2:30 UTC (permalink / raw)
To: Peter Zijlstra
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > debug the kernel.
> >
> > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > of the kernel side.
>
> This doesn't explain much if anything :/
>
> I'm confused (probably because I don't know RISC-V very well), what's
> EBREAK and how does it happen?
EBREAK is just an instruction of riscv which would rise breakpoint exception.
>
> Specifically, if EBREAK can happen inside an local_irq_disable() region,
> then the below change is actively wrong. Any exception/interrupt that
> can happen while local_irq_disable() must be treated like an NMI.
When the ebreak happend out of local_irq_disable region, but
__nmi_enter forces handle_break() into in_interupt() state. So how
about:
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d..69f7043a98b9 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -18,6 +18,7 @@
#include <linux/irq.h>
#include <linux/kexec.h>
#include <linux/entry-common.h>
+#include <linux/context_tracking.h>
#include <asm/asm-prototypes.h>
#include <asm/bug.h>
@@ -285,12 +286,18 @@ asmlinkage __visible __trap_section void
do_trap_break(struct pt_regs *regs)
handle_break(regs);
irqentry_exit_to_user_mode(regs);
- } else {
+ } else if (in_interrupt()){
irqentry_state_t state = irqentry_nmi_enter(regs);
handle_break(regs);
irqentry_nmi_exit(regs, state);
+ } else {
+ enum ctx_state prev_state = exception_enter();
+
+ handle_break(regs);
+
+ exception_exit(prev_state);
}
}
>
> If that makes kdb unhappy, fix kdb.
>
> > Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> > Reported-by: Daniel Thompson <daniel.thompson@linaro.org>
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> > arch/riscv/kernel/traps.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > index efc6b649985a..ed0eb9452f9e 100644
> > --- a/arch/riscv/kernel/traps.c
> > +++ b/arch/riscv/kernel/traps.c
> > @@ -18,6 +18,7 @@
> > #include <linux/irq.h>
> > #include <linux/kexec.h>
> > #include <linux/entry-common.h>
> > +#include <linux/context_tracking.h>
> >
> > #include <asm/asm-prototypes.h>
> > #include <asm/bug.h>
> > @@ -257,11 +258,11 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> >
> > irqentry_exit_to_user_mode(regs);
> > } else {
> > - irqentry_state_t state = irqentry_nmi_enter(regs);
> > + enum ctx_state prev_state = exception_enter();
> >
> > handle_break(regs);
> >
> > - irqentry_nmi_exit(regs, state);
> > + exception_exit(prev_state);
> > }
> > }
> >
> > --
> > 2.36.1
> >
--
Best Regards
Guo Ren
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
2023-07-09 2:30 ` Guo Ren
@ 2023-07-10 8:01 ` Peter Zijlstra
-1 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2023-07-10 8:01 UTC (permalink / raw)
To: Guo Ren
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > From: Guo Ren <guoren@linux.alibaba.com>
> > >
> > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > debug the kernel.
> > >
> > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > of the kernel side.
> >
> > This doesn't explain much if anything :/
> >
> > I'm confused (probably because I don't know RISC-V very well), what's
> > EBREAK and how does it happen?
> EBREAK is just an instruction of riscv which would rise breakpoint exception.
>
>
> >
> > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > then the below change is actively wrong. Any exception/interrupt that
> > can happen while local_irq_disable() must be treated like an NMI.
> When the ebreak happend out of local_irq_disable region, but
> __nmi_enter forces handle_break() into in_interupt() state. So how
And why is that a problem? I think I'm missing something fundamental
here...
> about:
>
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index f910dfccbf5d..69f7043a98b9 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -18,6 +18,7 @@
> #include <linux/irq.h>
> #include <linux/kexec.h>
> #include <linux/entry-common.h>
> +#include <linux/context_tracking.h>
>
> #include <asm/asm-prototypes.h>
> #include <asm/bug.h>
> @@ -285,12 +286,18 @@ asmlinkage __visible __trap_section void
> do_trap_break(struct pt_regs *regs)
> handle_break(regs);
>
> irqentry_exit_to_user_mode(regs);
> - } else {
> + } else if (in_interrupt()){
> irqentry_state_t state = irqentry_nmi_enter(regs);
>
> handle_break(regs);
>
> irqentry_nmi_exit(regs, state);
> + } else {
> + enum ctx_state prev_state = exception_enter();
> +
> + handle_break(regs);
> +
> + exception_exit(prev_state);
> }
> }
That's wrong. If you want to make it conditional, you have to look at
!(regs->status & SR_IE) (that's the interrupt enable flag of the
interrupted context, right?)
When you hit an EBREAK when IRQs were disabled, you must be NMI like.
But making it conditional like this makes it really hard to write a
handler though, it basically must assume it will be NMI contetx (because
it can't know) so there is no point in sometimes not doing NMI context.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-10 8:01 ` Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2023-07-10 8:01 UTC (permalink / raw)
To: Guo Ren
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > From: Guo Ren <guoren@linux.alibaba.com>
> > >
> > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > debug the kernel.
> > >
> > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > of the kernel side.
> >
> > This doesn't explain much if anything :/
> >
> > I'm confused (probably because I don't know RISC-V very well), what's
> > EBREAK and how does it happen?
> EBREAK is just an instruction of riscv which would rise breakpoint exception.
>
>
> >
> > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > then the below change is actively wrong. Any exception/interrupt that
> > can happen while local_irq_disable() must be treated like an NMI.
> When the ebreak happend out of local_irq_disable region, but
> __nmi_enter forces handle_break() into in_interupt() state. So how
And why is that a problem? I think I'm missing something fundamental
here...
> about:
>
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index f910dfccbf5d..69f7043a98b9 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -18,6 +18,7 @@
> #include <linux/irq.h>
> #include <linux/kexec.h>
> #include <linux/entry-common.h>
> +#include <linux/context_tracking.h>
>
> #include <asm/asm-prototypes.h>
> #include <asm/bug.h>
> @@ -285,12 +286,18 @@ asmlinkage __visible __trap_section void
> do_trap_break(struct pt_regs *regs)
> handle_break(regs);
>
> irqentry_exit_to_user_mode(regs);
> - } else {
> + } else if (in_interrupt()){
> irqentry_state_t state = irqentry_nmi_enter(regs);
>
> handle_break(regs);
>
> irqentry_nmi_exit(regs, state);
> + } else {
> + enum ctx_state prev_state = exception_enter();
> +
> + handle_break(regs);
> +
> + exception_exit(prev_state);
> }
> }
That's wrong. If you want to make it conditional, you have to look at
!(regs->status & SR_IE) (that's the interrupt enable flag of the
interrupted context, right?)
When you hit an EBREAK when IRQs were disabled, you must be NMI like.
But making it conditional like this makes it really hard to write a
handler though, it basically must assume it will be NMI contetx (because
it can't know) so there is no point in sometimes not doing NMI context.
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
2023-07-10 8:01 ` Peter Zijlstra
@ 2023-07-16 23:33 ` Guo Ren
-1 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2023-07-16 23:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > >
> > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > debug the kernel.
> > > >
> > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > of the kernel side.
> > >
> > > This doesn't explain much if anything :/
> > >
> > > I'm confused (probably because I don't know RISC-V very well), what's
> > > EBREAK and how does it happen?
> > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> >
> >
> > >
> > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > then the below change is actively wrong. Any exception/interrupt that
> > > can happen while local_irq_disable() must be treated like an NMI.
> > When the ebreak happend out of local_irq_disable region, but
> > __nmi_enter forces handle_break() into in_interupt() state. So how
>
> And why is that a problem? I think I'm missing something fundamental
> here...
The irqentry_nmi_enter() would force the current context to get
in_interrupt=true, although ebreak happens in the context which is
in_interrupt=false.
A lot of checking codes, such as:
if (in_interrupt())
panic("Fatal exception in interrupt");
It would make the kernel panic, but we don't panic; we want back to the shell.
eg:
echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
>
> > about:
> >
> > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > index f910dfccbf5d..69f7043a98b9 100644
> > --- a/arch/riscv/kernel/traps.c
> > +++ b/arch/riscv/kernel/traps.c
> > @@ -18,6 +18,7 @@
> > #include <linux/irq.h>
> > #include <linux/kexec.h>
> > #include <linux/entry-common.h>
> > +#include <linux/context_tracking.h>
> >
> > #include <asm/asm-prototypes.h>
> > #include <asm/bug.h>
> > @@ -285,12 +286,18 @@ asmlinkage __visible __trap_section void
> > do_trap_break(struct pt_regs *regs)
> > handle_break(regs);
> >
> > irqentry_exit_to_user_mode(regs);
> > - } else {
> > + } else if (in_interrupt()){
> > irqentry_state_t state = irqentry_nmi_enter(regs);
> >
> > handle_break(regs);
> >
> > irqentry_nmi_exit(regs, state);
> > + } else {
> > + enum ctx_state prev_state = exception_enter();
> > +
> > + handle_break(regs);
> > +
> > + exception_exit(prev_state);
> > }
> > }
>
> That's wrong. If you want to make it conditional, you have to look at
> !(regs->status & SR_IE) (that's the interrupt enable flag of the
> interrupted context, right?)
>
> When you hit an EBREAK when IRQs were disabled, you must be NMI like.
>
> But making it conditional like this makes it really hard to write a
> handler though, it basically must assume it will be NMI contetx (because
> it can't know) so there is no point in sometimes not doing NMI context.
--
Best Regards
Guo Ren
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-16 23:33 ` Guo Ren
0 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2023-07-16 23:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > >
> > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > debug the kernel.
> > > >
> > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > of the kernel side.
> > >
> > > This doesn't explain much if anything :/
> > >
> > > I'm confused (probably because I don't know RISC-V very well), what's
> > > EBREAK and how does it happen?
> > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> >
> >
> > >
> > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > then the below change is actively wrong. Any exception/interrupt that
> > > can happen while local_irq_disable() must be treated like an NMI.
> > When the ebreak happend out of local_irq_disable region, but
> > __nmi_enter forces handle_break() into in_interupt() state. So how
>
> And why is that a problem? I think I'm missing something fundamental
> here...
The irqentry_nmi_enter() would force the current context to get
in_interrupt=true, although ebreak happens in the context which is
in_interrupt=false.
A lot of checking codes, such as:
if (in_interrupt())
panic("Fatal exception in interrupt");
It would make the kernel panic, but we don't panic; we want back to the shell.
eg:
echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
>
> > about:
> >
> > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > index f910dfccbf5d..69f7043a98b9 100644
> > --- a/arch/riscv/kernel/traps.c
> > +++ b/arch/riscv/kernel/traps.c
> > @@ -18,6 +18,7 @@
> > #include <linux/irq.h>
> > #include <linux/kexec.h>
> > #include <linux/entry-common.h>
> > +#include <linux/context_tracking.h>
> >
> > #include <asm/asm-prototypes.h>
> > #include <asm/bug.h>
> > @@ -285,12 +286,18 @@ asmlinkage __visible __trap_section void
> > do_trap_break(struct pt_regs *regs)
> > handle_break(regs);
> >
> > irqentry_exit_to_user_mode(regs);
> > - } else {
> > + } else if (in_interrupt()){
> > irqentry_state_t state = irqentry_nmi_enter(regs);
> >
> > handle_break(regs);
> >
> > irqentry_nmi_exit(regs, state);
> > + } else {
> > + enum ctx_state prev_state = exception_enter();
> > +
> > + handle_break(regs);
> > +
> > + exception_exit(prev_state);
> > }
> > }
>
> That's wrong. If you want to make it conditional, you have to look at
> !(regs->status & SR_IE) (that's the interrupt enable flag of the
> interrupted context, right?)
>
> When you hit an EBREAK when IRQs were disabled, you must be NMI like.
>
> But making it conditional like this makes it really hard to write a
> handler though, it basically must assume it will be NMI contetx (because
> it can't know) so there is no point in sometimes not doing NMI context.
--
Best Regards
Guo Ren
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
2023-07-16 23:33 ` Guo Ren
@ 2023-07-17 10:45 ` Peter Zijlstra
-1 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2023-07-17 10:45 UTC (permalink / raw)
To: Guo Ren
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Mon, Jul 17, 2023 at 07:33:25AM +0800, Guo Ren wrote:
> On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > > >
> > > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > > debug the kernel.
> > > > >
> > > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > > of the kernel side.
> > > >
> > > > This doesn't explain much if anything :/
> > > >
> > > > I'm confused (probably because I don't know RISC-V very well), what's
> > > > EBREAK and how does it happen?
> > > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> > >
> > >
> > > >
> > > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > > then the below change is actively wrong. Any exception/interrupt that
> > > > can happen while local_irq_disable() must be treated like an NMI.
> > > When the ebreak happend out of local_irq_disable region, but
> > > __nmi_enter forces handle_break() into in_interupt() state. So how
> >
> > And why is that a problem? I think I'm missing something fundamental
> > here...
> The irqentry_nmi_enter() would force the current context to get
> in_interrupt=true, although ebreak happens in the context which is
> in_interrupt=false.
> A lot of checking codes, such as:
> if (in_interrupt())
> panic("Fatal exception in interrupt");
Why would you do that?!?
Are you're trying to differentiate between an exception and an
interrupt?
You *could* have ebreak in an interrupt, right? So why panic the machine
if that happens?
> It would make the kernel panic, but we don't panic; we want back to the shell.
> eg:
> echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-17 10:45 ` Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2023-07-17 10:45 UTC (permalink / raw)
To: Guo Ren
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Mon, Jul 17, 2023 at 07:33:25AM +0800, Guo Ren wrote:
> On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > > >
> > > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > > debug the kernel.
> > > > >
> > > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > > of the kernel side.
> > > >
> > > > This doesn't explain much if anything :/
> > > >
> > > > I'm confused (probably because I don't know RISC-V very well), what's
> > > > EBREAK and how does it happen?
> > > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> > >
> > >
> > > >
> > > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > > then the below change is actively wrong. Any exception/interrupt that
> > > > can happen while local_irq_disable() must be treated like an NMI.
> > > When the ebreak happend out of local_irq_disable region, but
> > > __nmi_enter forces handle_break() into in_interupt() state. So how
> >
> > And why is that a problem? I think I'm missing something fundamental
> > here...
> The irqentry_nmi_enter() would force the current context to get
> in_interrupt=true, although ebreak happens in the context which is
> in_interrupt=false.
> A lot of checking codes, such as:
> if (in_interrupt())
> panic("Fatal exception in interrupt");
Why would you do that?!?
Are you're trying to differentiate between an exception and an
interrupt?
You *could* have ebreak in an interrupt, right? So why panic the machine
if that happens?
> It would make the kernel panic, but we don't panic; we want back to the shell.
> eg:
> echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
2023-07-17 10:45 ` Peter Zijlstra
@ 2023-07-17 16:14 ` Guo Ren
-1 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2023-07-17 16:14 UTC (permalink / raw)
To: Peter Zijlstra
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Mon, Jul 17, 2023 at 6:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Jul 17, 2023 at 07:33:25AM +0800, Guo Ren wrote:
> > On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > > > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > > > >
> > > > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > > > debug the kernel.
> > > > > >
> > > > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > > > of the kernel side.
> > > > >
> > > > > This doesn't explain much if anything :/
> > > > >
> > > > > I'm confused (probably because I don't know RISC-V very well), what's
> > > > > EBREAK and how does it happen?
> > > > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> > > >
> > > >
> > > > >
> > > > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > > > then the below change is actively wrong. Any exception/interrupt that
> > > > > can happen while local_irq_disable() must be treated like an NMI.
> > > > When the ebreak happend out of local_irq_disable region, but
> > > > __nmi_enter forces handle_break() into in_interupt() state. So how
> > >
> > > And why is that a problem? I think I'm missing something fundamental
> > > here...
> > The irqentry_nmi_enter() would force the current context to get
> > in_interrupt=true, although ebreak happens in the context which is
> > in_interrupt=false.
> > A lot of checking codes, such as:
> > if (in_interrupt())
> > panic("Fatal exception in interrupt");
>
> Why would you do that?!?
>
> Are you're trying to differentiate between an exception and an
> interrupt?
>
> You *could* have ebreak in an interrupt, right? So why panic the machine
> if that happens?
Do you mean the below patch? Yes, it could fix up.
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d..92899db6696b 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -85,8 +85,6 @@ void die(struct pt_regs *regs, const char *str)
spin_unlock_irqrestore(&die_lock, flags);
oops_exit();
- if (in_interrupt())
- panic("Fatal exception in interrupt");
if (panic_on_oops)
panic("Fatal exception");
if (ret != NOTIFY_STOP)
diff --git a/kernel/exit.c b/kernel/exit.c
index edb50b4c9972..a46a1aef66ce 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -940,8 +940,6 @@ void __noreturn make_task_dead(int signr)
struct task_struct *tsk = current;
unsigned int limit;
- if (unlikely(in_interrupt()))
- panic("Aiee, killing interrupt handler!");
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");
But how does x86 deal with it without kernel/exit.c modifcation?
>
> > It would make the kernel panic, but we don't panic; we want back to the shell.
> > eg:
> > echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
>
>
>
--
Best Regards
Guo Ren
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side
@ 2023-07-17 16:14 ` Guo Ren
0 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2023-07-17 16:14 UTC (permalink / raw)
To: Peter Zijlstra
Cc: arnd, palmer, tglx, luto, conor.dooley, heiko, jszhang,
lazyparser, falcon, chenhuacai, apatel, atishp, mark.rutland,
bjorn, palmer, bjorn, daniel.thompson, linux-arch, linux-kernel,
linux-riscv, stable, Guo Ren
On Mon, Jul 17, 2023 at 6:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Jul 17, 2023 at 07:33:25AM +0800, Guo Ren wrote:
> > On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > > > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
> > > > > > From: Guo Ren <guoren@linux.alibaba.com>
> > > > > >
> > > > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > > > debug the kernel.
> > > > > >
> > > > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > > > of the kernel side.
> > > > >
> > > > > This doesn't explain much if anything :/
> > > > >
> > > > > I'm confused (probably because I don't know RISC-V very well), what's
> > > > > EBREAK and how does it happen?
> > > > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> > > >
> > > >
> > > > >
> > > > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > > > then the below change is actively wrong. Any exception/interrupt that
> > > > > can happen while local_irq_disable() must be treated like an NMI.
> > > > When the ebreak happend out of local_irq_disable region, but
> > > > __nmi_enter forces handle_break() into in_interupt() state. So how
> > >
> > > And why is that a problem? I think I'm missing something fundamental
> > > here...
> > The irqentry_nmi_enter() would force the current context to get
> > in_interrupt=true, although ebreak happens in the context which is
> > in_interrupt=false.
> > A lot of checking codes, such as:
> > if (in_interrupt())
> > panic("Fatal exception in interrupt");
>
> Why would you do that?!?
>
> Are you're trying to differentiate between an exception and an
> interrupt?
>
> You *could* have ebreak in an interrupt, right? So why panic the machine
> if that happens?
Do you mean the below patch? Yes, it could fix up.
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d..92899db6696b 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -85,8 +85,6 @@ void die(struct pt_regs *regs, const char *str)
spin_unlock_irqrestore(&die_lock, flags);
oops_exit();
- if (in_interrupt())
- panic("Fatal exception in interrupt");
if (panic_on_oops)
panic("Fatal exception");
if (ret != NOTIFY_STOP)
diff --git a/kernel/exit.c b/kernel/exit.c
index edb50b4c9972..a46a1aef66ce 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -940,8 +940,6 @@ void __noreturn make_task_dead(int signr)
struct task_struct *tsk = current;
unsigned int limit;
- if (unlikely(in_interrupt()))
- panic("Aiee, killing interrupt handler!");
if (unlikely(!tsk->pid))
panic("Attempted to kill the idle task!");
But how does x86 deal with it without kernel/exit.c modifcation?
>
> > It would make the kernel panic, but we don't panic; we want back to the shell.
> > eg:
> > echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
>
>
>
--
Best Regards
Guo Ren
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 20+ messages in thread