From: Joe Lawrence <joe.lawrence@redhat.com>
To: Nicolai Stange <nstange@suse.de>
Cc: Jiri Kosina <jkosina@suse.cz>,
linux-kernel@vger.kernel.org, Torsten Duwe <duwe@lst.de>,
live-patching@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: ppc64le reliable stack unwinder and scheduled tasks
Date: Mon, 14 Jan 2019 11:46:59 -0500 [thread overview]
Message-ID: <20190114164659.GA18643@redhat.com> (raw)
In-Reply-To: <877ef7bt6j.fsf@suse.de>
On Mon, Jan 14, 2019 at 08:21:40AM +0100, Nicolai Stange wrote:
> Joe Lawrence <joe.lawrence@redhat.com> writes:
>
> > We should be careful when inspecting the bottom-most stack frame (the
> > first to be unwound), particularly for scheduled-out tasks. As Nicolai
> > Stange explains, "If I'm reading the code in _switch() correctly, the
> > first frame is completely uninitialized except for the pointer back to
> > the caller's stack frame." If a previous do_IRQ() invocation, for
> > example, has left a residual exception-marker on the first frame, the
> > stack tracer would incorrectly report this task's trace as unreliable.
> >
>
> FWIW, it's not been do_IRQ() who wrote the exception marker, but it's
> caller hardware_interrupt_common(), more specifically the
> EXCEPTION_PROLOG_COMMON_3 part therof.
>
Hi Nicolai,
Yeah, I was sloppy with the description there. :)
> I thought about this a little more and can't see anything that would
> prevent higher, i.e. non-_switch() frames to also alias with prior
> exception frames. That STACK_FRAME_REGS_MARKER is written to a stack
> frame's "parameter area" and most functions probably don't initialize
> this either. So, AFAICS, higher stack frames could potentially be
> affected by the very same problem.
Hmm, I suppose a callee could leave that stack-word untouched and then
make subsquent calls, which would be confusing for the unwinder.
> I think the best solution would be to clear the STACK_FRAME_REGS_MARKER
> upon exception return. I have a patch ready for that and will post it
> after it has passed some basic testing -- hopefully later this day.
>
I agree that this seems like the simplest way to clean up the exception
stack frame state.
> That being said, I still think that your patch should also get applied
> in some form -- looking at unitialized memory is just not a good thing
> to do.
>
> [ ... snip ...]
> I would perhaps not limit this to the STACK_FRAME_REGS_MARKER, but also
> not emit the ip obtained from the first frame into the resulting trace.
>
> I.e., how about moving all the sp/newsp handling to the beginning of the
> loop and doing an 'if (firstframe) continue;' right after that?
Good point, there is a bunch of ip and trace entries bookkeeping that
shouldn't apply in this case.
I gave the following some very light testing (5.0.0-rc2 + Petr's atomic
patches as to include and run the selftests) ... if you want to take a
bigger hammer to refactor some of the sp/newsp code (perhaps it could be
incorporated into the for() loop itself), feel free to go for it. You
could add something like this as a 2nd patch to the previously mentioned
STACK_FRAME_REGS_MARKER cleanup fix.
Thanks,
-- Joe
-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--
From b87f9e81cf59a6e7e2309400e1b417562414cd5c Mon Sep 17 00:00:00 2001
From: Joe Lawrence <joe.lawrence@redhat.com>
Date: Sun, 13 Jan 2019 21:02:01 -0500
Subject: [PATCH] powerpc/livepatch: relax reliable stack tracer checks for
first-frame
The bottom-most stack frame (the first to be unwound) may be largely
uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only
requires its backchain pointer to be set.
The reliable stack tracer should be careful when verifying this frame:
skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that
may contain uninitialized residual data.
Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
Suggested-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
---
arch/powerpc/kernel/stacktrace.c | 33 +++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..46096687a5a8 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -84,6 +84,12 @@ save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
EXPORT_SYMBOL_GPL(save_stack_trace_regs);
#ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
+/*
+ * This function returns an error if it detects any unreliable features of the
+ * stack. Otherwise it guarantees that the stack trace is reliable.
+ *
+ * If the task is not 'current', the caller *must* ensure the task is inactive.
+ */
int
save_stack_trace_tsk_reliable(struct task_struct *tsk,
struct stack_trace *trace)
@@ -142,12 +148,6 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
if (sp & 0xF)
return 1;
- /* Mark stacktraces with exception frames as unreliable. */
- if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
- stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
- return 1;
- }
-
newsp = stack[0];
/* Stack grows downwards; unwinder may only go up. */
if (newsp <= sp)
@@ -158,11 +158,21 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
return 1; /* invalid backlink, too far up. */
}
+ /* We can only trust the bottom frame's backlink, the rest
+ * of the frame may be uninitialized, continue to the next. */
+ if (firstframe--)
+ goto next;
+
+ /* Mark stacktraces with exception frames as unreliable. */
+ if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
+ stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
+ return 1;
+ }
+
/* Examine the saved LR: it must point into kernel code. */
ip = stack[STACK_FRAME_LR_SAVE];
- if (!firstframe && !__kernel_text_address(ip))
+ if (!__kernel_text_address(ip))
return 1;
- firstframe = 0;
/*
* FIXME: IMHO these tests do not belong in
@@ -183,12 +193,13 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
else
trace->skip--;
- if (newsp == stack_end)
- break;
-
if (trace->nr_entries >= trace->max_entries)
return -E2BIG;
+next:
+ if (newsp == stack_end)
+ break;
+
sp = newsp;
}
return 0;
--
2.20.1
WARNING: multiple messages have this Message-ID (diff)
From: Joe Lawrence <joe.lawrence@redhat.com>
To: Nicolai Stange <nstange@suse.de>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
live-patching@vger.kernel.org, Torsten Duwe <duwe@lst.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Jiri Kosina <jkosina@suse.cz>,
Balbir Singh <bsingharora@gmail.com>
Subject: Re: ppc64le reliable stack unwinder and scheduled tasks
Date: Mon, 14 Jan 2019 11:46:59 -0500 [thread overview]
Message-ID: <20190114164659.GA18643@redhat.com> (raw)
In-Reply-To: <877ef7bt6j.fsf@suse.de>
On Mon, Jan 14, 2019 at 08:21:40AM +0100, Nicolai Stange wrote:
> Joe Lawrence <joe.lawrence@redhat.com> writes:
>
> > We should be careful when inspecting the bottom-most stack frame (the
> > first to be unwound), particularly for scheduled-out tasks. As Nicolai
> > Stange explains, "If I'm reading the code in _switch() correctly, the
> > first frame is completely uninitialized except for the pointer back to
> > the caller's stack frame." If a previous do_IRQ() invocation, for
> > example, has left a residual exception-marker on the first frame, the
> > stack tracer would incorrectly report this task's trace as unreliable.
> >
>
> FWIW, it's not been do_IRQ() who wrote the exception marker, but it's
> caller hardware_interrupt_common(), more specifically the
> EXCEPTION_PROLOG_COMMON_3 part therof.
>
Hi Nicolai,
Yeah, I was sloppy with the description there. :)
> I thought about this a little more and can't see anything that would
> prevent higher, i.e. non-_switch() frames to also alias with prior
> exception frames. That STACK_FRAME_REGS_MARKER is written to a stack
> frame's "parameter area" and most functions probably don't initialize
> this either. So, AFAICS, higher stack frames could potentially be
> affected by the very same problem.
Hmm, I suppose a callee could leave that stack-word untouched and then
make subsquent calls, which would be confusing for the unwinder.
> I think the best solution would be to clear the STACK_FRAME_REGS_MARKER
> upon exception return. I have a patch ready for that and will post it
> after it has passed some basic testing -- hopefully later this day.
>
I agree that this seems like the simplest way to clean up the exception
stack frame state.
> That being said, I still think that your patch should also get applied
> in some form -- looking at unitialized memory is just not a good thing
> to do.
>
> [ ... snip ...]
> I would perhaps not limit this to the STACK_FRAME_REGS_MARKER, but also
> not emit the ip obtained from the first frame into the resulting trace.
>
> I.e., how about moving all the sp/newsp handling to the beginning of the
> loop and doing an 'if (firstframe) continue;' right after that?
Good point, there is a bunch of ip and trace entries bookkeeping that
shouldn't apply in this case.
I gave the following some very light testing (5.0.0-rc2 + Petr's atomic
patches as to include and run the selftests) ... if you want to take a
bigger hammer to refactor some of the sp/newsp code (perhaps it could be
incorporated into the for() loop itself), feel free to go for it. You
could add something like this as a 2nd patch to the previously mentioned
STACK_FRAME_REGS_MARKER cleanup fix.
Thanks,
-- Joe
-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--
From b87f9e81cf59a6e7e2309400e1b417562414cd5c Mon Sep 17 00:00:00 2001
From: Joe Lawrence <joe.lawrence@redhat.com>
Date: Sun, 13 Jan 2019 21:02:01 -0500
Subject: [PATCH] powerpc/livepatch: relax reliable stack tracer checks for
first-frame
The bottom-most stack frame (the first to be unwound) may be largely
uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only
requires its backchain pointer to be set.
The reliable stack tracer should be careful when verifying this frame:
skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that
may contain uninitialized residual data.
Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
Suggested-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
---
arch/powerpc/kernel/stacktrace.c | 33 +++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..46096687a5a8 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -84,6 +84,12 @@ save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
EXPORT_SYMBOL_GPL(save_stack_trace_regs);
#ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
+/*
+ * This function returns an error if it detects any unreliable features of the
+ * stack. Otherwise it guarantees that the stack trace is reliable.
+ *
+ * If the task is not 'current', the caller *must* ensure the task is inactive.
+ */
int
save_stack_trace_tsk_reliable(struct task_struct *tsk,
struct stack_trace *trace)
@@ -142,12 +148,6 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
if (sp & 0xF)
return 1;
- /* Mark stacktraces with exception frames as unreliable. */
- if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
- stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
- return 1;
- }
-
newsp = stack[0];
/* Stack grows downwards; unwinder may only go up. */
if (newsp <= sp)
@@ -158,11 +158,21 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
return 1; /* invalid backlink, too far up. */
}
+ /* We can only trust the bottom frame's backlink, the rest
+ * of the frame may be uninitialized, continue to the next. */
+ if (firstframe--)
+ goto next;
+
+ /* Mark stacktraces with exception frames as unreliable. */
+ if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
+ stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
+ return 1;
+ }
+
/* Examine the saved LR: it must point into kernel code. */
ip = stack[STACK_FRAME_LR_SAVE];
- if (!firstframe && !__kernel_text_address(ip))
+ if (!__kernel_text_address(ip))
return 1;
- firstframe = 0;
/*
* FIXME: IMHO these tests do not belong in
@@ -183,12 +193,13 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
else
trace->skip--;
- if (newsp == stack_end)
- break;
-
if (trace->nr_entries >= trace->max_entries)
return -E2BIG;
+next:
+ if (newsp == stack_end)
+ break;
+
sp = newsp;
}
return 0;
--
2.20.1
next prev parent reply other threads:[~2019-01-14 16:48 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-10 21:14 ppc64le reliable stack unwinder and scheduled tasks Joe Lawrence
2019-01-10 21:14 ` Joe Lawrence
2019-01-11 0:00 ` Nicolai Stange
2019-01-11 0:00 ` Nicolai Stange
2019-01-11 1:08 ` Joe Lawrence
2019-01-11 1:08 ` Joe Lawrence
2019-01-11 7:51 ` Nicolai Stange
2019-01-11 7:51 ` Nicolai Stange
2019-01-14 4:09 ` Joe Lawrence
2019-01-14 4:09 ` Joe Lawrence
2019-01-14 7:21 ` Nicolai Stange
2019-01-14 7:21 ` Nicolai Stange
2019-01-14 16:46 ` Joe Lawrence [this message]
2019-01-14 16:46 ` Joe Lawrence
2019-01-14 17:09 ` Josh Poimboeuf
2019-01-14 17:09 ` Josh Poimboeuf
2019-01-14 17:54 ` Joe Lawrence
2019-01-14 17:54 ` Joe Lawrence
2019-01-12 1:09 ` Balbir Singh
2019-01-12 1:09 ` Balbir Singh
2019-01-12 8:45 ` Segher Boessenkool
2019-01-13 12:33 ` Balbir Singh
2019-01-13 13:05 ` Torsten Duwe
2019-01-13 13:05 ` Torsten Duwe
2019-01-17 14:52 ` Segher Boessenkool
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190114164659.GA18643@redhat.com \
--to=joe.lawrence@redhat.com \
--cc=duwe@lst.de \
--cc=jkosina@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=live-patching@vger.kernel.org \
--cc=nstange@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.