linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Joe Lawrence <joe.lawrence@redhat.com>
To: Nicolai Stange <nstange@suse.de>
Cc: Jiri Kosina <jkosina@suse.cz>,
	linux-kernel@vger.kernel.org, Torsten Duwe <duwe@lst.de>,
	live-patching@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: ppc64le reliable stack unwinder and scheduled tasks
Date: Mon, 14 Jan 2019 11:46:59 -0500	[thread overview]
Message-ID: <20190114164659.GA18643@redhat.com> (raw)
In-Reply-To: <877ef7bt6j.fsf@suse.de>

On Mon, Jan 14, 2019 at 08:21:40AM +0100, Nicolai Stange wrote:
> Joe Lawrence <joe.lawrence@redhat.com> writes:
> 
> > We should be careful when inspecting the bottom-most stack frame (the
> > first to be unwound), particularly for scheduled-out tasks.  As Nicolai
> > Stange explains, "If I'm reading the code in _switch() correctly, the
> > first frame is completely uninitialized except for the pointer back to
> > the caller's stack frame."  If a previous do_IRQ() invocation, for
> > example, has left a residual exception-marker on the first frame, the
> > stack tracer would incorrectly report this task's trace as unreliable.
> >
> 
> FWIW, it's not been do_IRQ() who wrote the exception marker, but it's
> caller hardware_interrupt_common(), more specifically the
> EXCEPTION_PROLOG_COMMON_3 part therof.
> 

Hi Nicolai,

Yeah, I was sloppy with the description there. :)

> I thought about this a little more and can't see anything that would
> prevent higher, i.e. non-_switch() frames to also alias with prior
> exception frames. That STACK_FRAME_REGS_MARKER is written to a stack
> frame's "parameter area" and most functions probably don't initialize
> this either. So, AFAICS, higher stack frames could potentially be
> affected by the very same problem.

Hmm, I suppose a callee could leave that stack-word untouched and then
make subsquent calls, which would be confusing for the unwinder.

> I think the best solution would be to clear the STACK_FRAME_REGS_MARKER
> upon exception return. I have a patch ready for that and will post it
> after it has passed some basic testing -- hopefully later this day.
> 

I agree that this seems like the simplest way to clean up the exception
stack frame state. 

> That being said, I still think that your patch should also get applied
> in some form -- looking at unitialized memory is just not a good thing
> to do.
> 
> [ ... snip ...]

> I would perhaps not limit this to the STACK_FRAME_REGS_MARKER, but also
> not emit the ip obtained from the first frame into the resulting trace.
> 
> I.e., how about moving all the sp/newsp handling to the beginning of the
> loop and doing an 'if (firstframe) continue;' right after that?

Good point, there is a bunch of ip and trace entries bookkeeping that
shouldn't apply in this case.

I gave the following some very light testing (5.0.0-rc2 + Petr's atomic
patches as to include and run the selftests) ... if you want to take a
bigger hammer to refactor some of the sp/newsp code (perhaps it could be
incorporated into the for() loop itself), feel free to go for it.  You
could add something like this as a 2nd patch to the previously mentioned
STACK_FRAME_REGS_MARKER cleanup fix.

Thanks,

-- Joe

-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--

From b87f9e81cf59a6e7e2309400e1b417562414cd5c Mon Sep 17 00:00:00 2001
From: Joe Lawrence <joe.lawrence@redhat.com>
Date: Sun, 13 Jan 2019 21:02:01 -0500
Subject: [PATCH] powerpc/livepatch: relax reliable stack tracer checks for
 first-frame

The bottom-most stack frame (the first to be unwound) may be largely
uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only
requires its backchain pointer to be set.

The reliable stack tracer should be careful when verifying this frame:
skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that
may contain uninitialized residual data.

Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
Suggested-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
---
 arch/powerpc/kernel/stacktrace.c | 33 +++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..46096687a5a8 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -84,6 +84,12 @@ save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
 EXPORT_SYMBOL_GPL(save_stack_trace_regs);
 
 #ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
+/*
+ * This function returns an error if it detects any unreliable features of the
+ * stack.  Otherwise it guarantees that the stack trace is reliable.
+ *
+ * If the task is not 'current', the caller *must* ensure the task is inactive.
+ */
 int
 save_stack_trace_tsk_reliable(struct task_struct *tsk,
 				struct stack_trace *trace)
@@ -142,12 +148,6 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 		if (sp & 0xF)
 			return 1;
 
-		/* Mark stacktraces with exception frames as unreliable. */
-		if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
-		    stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
-			return 1;
-		}
-
 		newsp = stack[0];
 		/* Stack grows downwards; unwinder may only go up. */
 		if (newsp <= sp)
@@ -158,11 +158,21 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 			return 1; /* invalid backlink, too far up. */
 		}
 
+		/* We can only trust the bottom frame's backlink, the rest
+		 * of the frame may be uninitialized, continue to the next. */
+		if (firstframe--)
+			goto next;
+
+		/* Mark stacktraces with exception frames as unreliable. */
+		if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
+		    stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
+			return 1;
+		}
+
 		/* Examine the saved LR: it must point into kernel code. */
 		ip = stack[STACK_FRAME_LR_SAVE];
-		if (!firstframe && !__kernel_text_address(ip))
+		if (!__kernel_text_address(ip))
 			return 1;
-		firstframe = 0;
 
 		/*
 		 * FIXME: IMHO these tests do not belong in
@@ -183,12 +193,13 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 		else
 			trace->skip--;
 
-		if (newsp == stack_end)
-			break;
-
 		if (trace->nr_entries >= trace->max_entries)
 			return -E2BIG;
 
+next:
+		if (newsp == stack_end)
+			break;
+
 		sp = newsp;
 	}
 	return 0;
-- 
2.20.1



  reply	other threads:[~2019-01-14 16:48 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-10 21:14 ppc64le reliable stack unwinder and scheduled tasks Joe Lawrence
2019-01-11  0:00 ` Nicolai Stange
2019-01-11  1:08   ` Joe Lawrence
2019-01-11  7:51     ` Nicolai Stange
2019-01-14  4:09       ` Joe Lawrence
2019-01-14  7:21         ` Nicolai Stange
2019-01-14 16:46           ` Joe Lawrence [this message]
2019-01-14 17:09             ` Josh Poimboeuf
2019-01-14 17:54               ` Joe Lawrence
2019-01-12  1:09 ` Balbir Singh
2019-01-12  8:45   ` Segher Boessenkool
2019-01-13 12:33     ` Balbir Singh
2019-01-13 13:05       ` Torsten Duwe
2019-01-17 14:52       ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190114164659.GA18643@redhat.com \
    --to=joe.lawrence@redhat.com \
    --cc=duwe@lst.de \
    --cc=jkosina@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=live-patching@vger.kernel.org \
    --cc=nstange@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).