From: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Chen Jun <chenjun102@huawei.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, akpm@linux-foundation.org,
	will@kernel.org, rui.xiang@huawei.com,
	Mark Brown <broonie@kernel.org>
Subject: Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
Date: Thu, 18 Mar 2021 16:17:24 +0000
Message-ID: <20210318161723.GA10758@arm.com>
In-Reply-To: <20210317193416.GB9786@C02TD0UTHF1T.local>

On Wed, Mar 17, 2021 at 07:34:16PM +0000, Mark Rutland wrote:
> On Wed, Mar 17, 2021 at 06:36:36PM +0000, Catalin Marinas wrote:
> > On Wed, Mar 17, 2021 at 02:20:50PM +0000, Chen Jun wrote:
> > > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same
> > > stack:
> > >  stack_trace_save+0x4c/0x78
> > >  register_early_stack+0x34/0x70
> > >  init_page_owner+0x34/0x230
> > >  page_ext_init+0x1bc/0x1dc
> > > 
> > > The reason is that:
> > > check_recursive_alloc always return 1 because that
> > > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60).
> > > 
> > > The root cause is that:
> > > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")
> > > make the save_trace save 2 more entries.
> > > 
> > > Add skip in arch_stack_walk when task == current.
> > > 
> > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK")
> > > Signed-off-by: Chen Jun <chenjun102@huawei.com>
> > > ---
> > >  arch/arm64/kernel/stacktrace.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > > index ad20981..c26b0ac 100644
> > > --- a/arch/arm64/kernel/stacktrace.c
> > > +++ b/arch/arm64/kernel/stacktrace.c
> > > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> > >  
> > >  	if (regs)
> > >  		start_backtrace(&frame, regs->regs[29], regs->pc);
> > > -	else if (task == current)
> > > +	else if (task == current) {
> > > +		((struct stacktrace_cookie *)cookie)->skip += 2;
> > >  		start_backtrace(&frame,
> > >  				(unsigned long)__builtin_frame_address(0),
> > >  				(unsigned long)arch_stack_walk);
> > > -	else
> > > +	} else
> > >  		start_backtrace(&frame, thread_saved_fp(task),
> > >  				thread_saved_pc(task));
> > 
> > I don't like abusing the cookie here. It's void * as it's meant to be an
> > opaque type. I'd rather skip the first two frames in walk_stackframe()
> > instead before invoking fn().
> 
> I agree that we shouldn't touch cookie here.
> 
> I don't think that it's right to bodge this inside walk_stackframe(),
> since that'll add bogus skipping for the case starting with regs in the
> current task. If we need a bodge, it has to live in arch_stack_walk()
> where we set up the initial unwinding state.

Good point. However, instead of relying on __builtin_frame_address(1),
can we add a 'skip' value to struct stackframe via arch_stack_walk() ->
start_backtrace() that is consumed by walk_stackframe()?

> In another thread, we came to the conclusion that arch_stack_walk()
> should start at its parent, and its parent should add any skipping it
> requires.

This makes sense.

> Currently, arch_stack_walk() is off-by-one, and we can bodge that by
> using __builtin_frame_address(1), though I'm waiting for some compiler
> folk to confirm that's sound. Otherwise we need to add an assembly
> trampoline to snapshot the FP, which is unfortunately convoluted.
> 
> This report suggests that a caller of arch_stack_walk() is off-by-one
> too, which suggests a larger cross-architecture semantic issue. I'll try
> to take a look tomorrow.

I don't think the caller is off by one, at least not by the final skip
value. __set_page_owner() wants the trace to start at its caller. The
callee save_stack() in the same file adds a skip of 2.
save_stack_trace() increments the skip before invoking
arch_stack_walk(). So far, this assumes that arch_stack_walk() starts at
its parent, i.e. save_stack_trace().
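
To make that accounting concrete, here is a minimal user-space sketch, not
kernel code. The chain[] array and the helper first_recorded() are
hypothetical and only model the page_owner call chain; stack_trace_save is
the name from the reported trace (the text above calls it
save_stack_trace()).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical call chain for the page_owner case, innermost frame first. */
static const char *const chain[] = {
	"arch_stack_walk",
	"stack_trace_save",
	"save_stack",
	"__set_page_owner",
	"caller of __set_page_owner",
};

/*
 * start: index of the first frame the unwinder reports
 * skip:  total number of reported frames dropped before recording
 * Returns the name that would land in entries[0], or NULL if nothing is left.
 */
const char *first_recorded(size_t start, size_t skip)
{
	size_t n = sizeof(chain) / sizeof(chain[0]);
	size_t idx = start + skip;

	return idx < n ? chain[idx] : NULL;
}

/* Assumed skip: 2 from save_stack() plus 1 added by stack_trace_save(). */
void check_first_recorded(void)
{
	/* Walk starts at arch_stack_walk() itself: the reported behaviour. */
	assert(strcmp(first_recorded(0, 3), "__set_page_owner") == 0);
	/* Walk starts at the parent: the intended behaviour. */
	assert(strcmp(first_recorded(1, 3), "caller of __set_page_owner") == 0);
}
```

With a total skip of 3, starting the walk at arch_stack_walk() itself leaves
__set_page_owner in entries[0]; starting one frame later leaves its caller
there instead.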

So save_stack_trace() only needs to skip 1 and I think that's in line
with the original report where the entries[0] is __set_page_owner(). We
only need to skip one. Another untested quick hack (we should probably
add the skip argument to start_backtrace()):

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index eb29b1fe8255..0d32d932ac89 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -56,6 +56,7 @@ struct stackframe {
 	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
 	unsigned long prev_fp;
 	enum stack_type prev_type;
+	int skip;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	int graph;
 #endif
@@ -153,6 +154,7 @@ static inline void start_backtrace(struct stackframe *frame,
 {
 	frame->fp = fp;
 	frame->pc = pc;
+	frame->skip = 0;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	frame->graph = 0;
 #endif
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index ad20981dfda4..a89b2ecbf3de 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -118,7 +118,9 @@ void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
 	while (1) {
 		int ret;
 
-		if (!fn(data, frame->pc))
+		if (frame->skip > 0)
+			frame->skip--;
+		else if (!fn(data, frame->pc))
 			break;
 		ret = unwind_frame(tsk, frame);
 		if (ret < 0)
@@ -201,11 +203,12 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
 
 	if (regs)
 		start_backtrace(&frame, regs->regs[29], regs->pc);
-	else if (task == current)
+	else if (task == current) {
 		start_backtrace(&frame,
 				(unsigned long)__builtin_frame_address(0),
 				(unsigned long)arch_stack_walk);
-	else
+		frame.skip = 1;
+	} else
 		start_backtrace(&frame, thread_saved_fp(task),
 				thread_saved_pc(task));
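
The loop change in the hunks above can be modelled in user space as follows.
This is only a sketch under simplifying assumptions: struct model_frame,
model_unwind(), model_walk() and record_entry() are hypothetical stand-ins,
and the "stack" is just a range of indices rather than real frame records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct stackframe: pc is an index on a fake stack. */
struct model_frame {
	size_t pc;
	size_t depth;	/* number of frames on the fake stack */
	int skip;	/* frames to drop before calling the consumer */
};

typedef bool (*consume_fn)(void *cookie, size_t pc);

/* Stand-in for unwind_frame(): step to the caller, fail at the stack top. */
static int model_unwind(struct model_frame *frame)
{
	if (frame->pc + 1 >= frame->depth)
		return -1;
	frame->pc++;
	return 0;
}

/* Same loop shape as the walk_stackframe() hunk: consume the skip first. */
void model_walk(struct model_frame *frame, consume_fn fn, void *cookie)
{
	while (1) {
		if (frame->skip > 0)
			frame->skip--;
		else if (!fn(cookie, frame->pc))
			break;
		if (model_unwind(frame) < 0)
			break;
	}
}

/* Consumer that records entries; max must not exceed 8 in this sketch. */
struct record {
	size_t entries[8];
	size_t len;
	size_t max;
};

bool record_entry(void *cookie, size_t pc)
{
	struct record *r = cookie;

	if (r->len >= r->max)
		return false;
	r->entries[r->len++] = pc;
	return r->len < r->max;	/* stop the walk once the buffer is full */
}
```

With skip set to 1 and a fake stack of five frames, the consumer never sees
frame 0 and records frames 1 to 4; with skip left at 0 it records all five.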
 

-- 
Catalin
