linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Jiri Kosina <jikos@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	X86 ML <x86@kernel.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	live-patching@vger.kernel.org,
	Michael Ellerman <mpe@ellerman.id.au>,
	Chris J Arges <chris.j.arges@canonical.com>,
	linuxppc-dev@lists.ozlabs.org, Jessica Yu <jeyu@redhat.com>,
	Petr Mladek <pmladek@suse.com>, Jiri Slaby <jslaby@suse.cz>,
	Vojtech Pavlik <vojtech@suse.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Miroslav Benes <mbenes@suse.cz>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking
Date: Mon, 2 May 2016 14:44:44 -0500	[thread overview]
Message-ID: <20160502194444.warei6nshg3uvrvx@treble> (raw)
In-Reply-To: <CALCETrUwHc5G37hoNKou3nBO+5RjenOJqXYR4yGaj9yCZwXbZA@mail.gmail.com>

On Mon, May 02, 2016 at 11:12:39AM -0700, Andy Lutomirski wrote:
> On Mon, May 2, 2016 at 10:31 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > On Mon, May 02, 2016 at 08:52:41AM -0700, Andy Lutomirski wrote:
> >> On Mon, May 2, 2016 at 6:52 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >> > On Fri, Apr 29, 2016 at 05:08:50PM -0700, Andy Lutomirski wrote:
> >> >> On Apr 29, 2016 3:41 PM, "Josh Poimboeuf" <jpoimboe@redhat.com> wrote:
> >> >> >
> >> >> > On Fri, Apr 29, 2016 at 02:37:41PM -0700, Andy Lutomirski wrote:
> >> >> > > On Fri, Apr 29, 2016 at 2:25 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >> >> > > >> I suppose we could try to rejigger the code so that rbp points to
> >> >> > > >> pt_regs or similar.
> >> >> > > >
> >> >> > > > I think we should avoid doing something like that because it would break
> >> >> > > > gdb and all the other unwinders who don't know about it.
> >> >> > >
> >> >> > > How so?
> >> >> > >
> >> >> > > Currently, rbp in the entry code is meaningless.  I'm suggesting that,
> >> >> > > when we do, for example, 'call \do_sym' in idtentry, we point rbp to
> >> >> > > the pt_regs.  Currently it points to something stale (which the
> >> >> > > dump_stack code might be relying on.  Hmm.)  But it's probably also
> >> >> > > safe to assume that if you unwind to the 'call \do_sym', then pt_regs
> >> >> > > is the next thing on the stack, so just doing the section thing would
> >> >> > > work.
> >> >> >
> >> >> > Yes, rbp is meaningless on the entry from user space.  But if an
> >> >> > in-kernel interrupt occurs (e.g. page fault, preemption) and you have
> >> >> > nested entry, rbp keeps its old value, right?  So the unwinder can walk
> >> >> > past the nested entry frame and keep going until it gets to the original
> >> >> > entry.
> >> >>
> >> >> Yes.
> >> >>
> >> >> It would be nice if we could do better, though, and actually notice
> >> >> the pt_regs and identify the entry.  For example, I'd love to see
> >> >> "page fault, RIP=xyz" printed in the middle of a stack dump on a
> >> >> crash.
> >> >>
> >> >> Also, I think that just following rbp links will lose the
> >> >> actual function that took the page fault (or whatever function
> >> >> pt_regs->ip actually points to).
> >> >
> >> > Hm.  I think we could fix all that in a more standard way.  Whenever a
> >> > new pt_regs frame gets saved on entry, we could also create a new stack
> >> > frame which points to a fake kernel_entry() function.  That would tell
> >> > the unwinder there's a pt_regs frame without otherwise breaking frame
> >> > pointers across the frame.
> >> >
> >> > Then I guess we wouldn't need my other solution of putting the idt
> >> > entries in a special section.
> >> >
> >> > How does that sound?
> >>
> >> Let me try to understand.
> >>
> >> The normal call sequence is call; push %rbp; mov %rsp, %rbp.  So rbp
> >> points to (prev rbp, prev rip) on the stack, and you can follow the
> >> chain back.  Right now, on a user access page fault or similar, we
> >> have rbp (probably) pointing to the interrupted frame, and the
> >> interrupted rip isn't saved anywhere that a naive unwinder can find
> >> it.  (It's in pt_regs, but the rbp chain skips right over that.)
> >>
> >> We could change the entry code so that an interrupt / idtentry does:
> >>
> >> push pt_regs
> >> push kernel_entry
> >> push %rbp
> >> mov %rsp, %rbp
> >> call handler
> >> pop %rbp
> >> addq $8, %rsp
> >>
> >> or similar.  That would make it appear that the actual C handler was
> >> caused by a dummy function "kernel_entry".  Now the unwinder would get
> >> to kernel_entry, but it *still* wouldn't find its way to the calling
> >> frame, which only solves part of the problem.  We could at least teach
> >> the unwinder how kernel_entry works and let it decode pt_regs to
> >> continue unwinding.  This would be nice, and I think it could work.
> >
> > Yeah, that's about what I had in mind.
> 
> FWIW, I just tried this:
> 
> static bool is_entry_text(unsigned long addr)
> {
>     return addr >= (unsigned long)__entry_text_start &&
>         addr < (unsigned long)__entry_text_end;
> }
> 
> it works.  So the entry code is already annotated reasonably well :)
> 
> I just hacked it up here:
> 
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=stack&id=085eacfe0edfc18768e48340084415dba9a6bd21
> 
> and it seems to work, at least for page faults.  A better
> implementation would print out the entire contents of pt_regs so that
> people reading the stack trace will know the registers at the time of
> the exception, which might be helpful.

I still think we would need more specific annotations to do that
reliably: a call from entry code doesn't necessarily correlate with a
pt_regs frame.

> >> I think I like this, except that, if it used a separate section, it
> >> could potentially be faster, as, for each actual entry type, the
> >> offset from the C handler frame to pt_regs is a foregone conclusion.
> >
> > Hm, this I don't really follow.  It's true that the unwinder can easily
> > find RIP from pt_regs, which will always be a known offset from the
> > kernel_entry pointer on the stack.  But why would having the entry code
> > in a separate section make that faster?
> 
> It doesn't make the unwinder faster -- it makes the entry code faster.

Oh, right.  But I don't think a few extra frame pointer instructions are
much of an issue if you already have CONFIG_FRAME_POINTER enabled.

Anyway I'm not sure which way is better.  I'll think about it.

> I hope your plans include rewriting the current stack unwinder
> completely.  The thing in print_context_stack is (a)
> hard-to-understand and hard-to-modify crap and (b) is called in a loop
> from another file using totally ridiculous conventions.

I agree, that code is quite confusing.  I haven't really thought about
how specifically it could be improved or replaced though.

Along those lines, I think it would be awesome if we could have an
arch-independent DWARF unwinder so that most of the stack dumping code
could be shared amongst all the arches.

-- 
Josh

  parent reply	other threads:[~2016-05-02 19:44 UTC|newest]

Thread overview: 118+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-28 20:44 [RFC PATCH v2 00/18] livepatch: hybrid consistency model Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 01/18] x86/asm/head: clean up initial stack variable Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 02/18] x86/asm/head: use a common function for starting CPUs Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 03/18] x86/asm/head: standardize the bottom of the stack for idle tasks Josh Poimboeuf
2016-04-29 18:46   ` Brian Gerst
2016-04-29 20:28     ` Josh Poimboeuf
2016-04-29 19:39   ` Andy Lutomirski
2016-04-29 20:50     ` Josh Poimboeuf
2016-04-29 21:38       ` Andy Lutomirski
2016-04-29 23:27         ` Josh Poimboeuf
2016-04-30  0:10           ` Andy Lutomirski
2016-04-28 20:44 ` [RFC PATCH v2 04/18] x86: move _stext marker before head code Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking Josh Poimboeuf
2016-04-29 18:06   ` Andy Lutomirski
2016-04-29 20:11     ` Josh Poimboeuf
2016-04-29 20:19       ` Andy Lutomirski
2016-04-29 20:27         ` Josh Poimboeuf
2016-04-29 20:32           ` Andy Lutomirski
2016-04-29 21:25             ` Josh Poimboeuf
2016-04-29 21:37               ` Andy Lutomirski
2016-04-29 22:11                 ` Jiri Kosina
2016-04-29 22:57                   ` Josh Poimboeuf
2016-04-30  0:09                   ` Andy Lutomirski
2016-04-29 22:41                 ` Josh Poimboeuf
2016-04-30  0:08                   ` Andy Lutomirski
2016-05-02 13:52                     ` Josh Poimboeuf
2016-05-02 15:52                       ` Andy Lutomirski
2016-05-02 17:31                         ` Josh Poimboeuf
2016-05-02 18:12                           ` Andy Lutomirski
2016-05-02 18:34                             ` Ingo Molnar
2016-05-02 19:44                             ` Josh Poimboeuf [this message]
2016-05-02 19:54                             ` Jiri Kosina
2016-05-02 20:00                               ` Jiri Kosina
2016-05-03  0:39                                 ` Andy Lutomirski
2016-05-04 15:16                             ` David Laight
2016-05-19 23:15                         ` Josh Poimboeuf
2016-05-19 23:39                           ` Andy Lutomirski
2016-05-20 14:05                             ` Josh Poimboeuf
2016-05-20 15:41                               ` Andy Lutomirski
2016-05-20 16:41                                 ` Josh Poimboeuf
2016-05-20 16:59                                   ` Andy Lutomirski
2016-05-20 17:49                                     ` Josh Poimboeuf
2016-05-23 23:02                                     ` Jiri Kosina
2016-05-24  1:42                                       ` Andy Lutomirski
2016-05-23 21:34                           ` Andy Lutomirski
2016-05-24  2:28                             ` Josh Poimboeuf
2016-05-24  3:52                               ` Andy Lutomirski
2016-06-22 16:30                                 ` Josh Poimboeuf
2016-06-22 17:59                                   ` Andy Lutomirski
2016-06-22 18:22                                     ` Josh Poimboeuf
2016-06-22 18:26                                       ` Andy Lutomirski
2016-06-22 18:40                                         ` Josh Poimboeuf
2016-06-22 19:17                                           ` Andy Lutomirski
2016-06-23 16:19                                             ` Josh Poimboeuf
2016-06-23 16:35                                               ` Andy Lutomirski
2016-06-23 18:31                                                 ` Josh Poimboeuf
2016-06-23 20:40                                                   ` Josh Poimboeuf
2016-06-23 22:00                                                     ` Andy Lutomirski
2016-06-23  0:09                                   ` Andy Lutomirski
2016-06-23 15:55                                     ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 06/18] x86: dump_trace() error handling Josh Poimboeuf
2016-04-29 13:45   ` Minfei Huang
2016-04-29 14:00     ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 07/18] stacktrace/x86: function for detecting reliable stack traces Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 08/18] livepatch: temporary stubs for klp_patch_pending() and klp_patch_task() Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 09/18] livepatch/x86: add TIF_PATCH_PENDING thread flag Josh Poimboeuf
2016-04-29 18:08   ` Andy Lutomirski
2016-04-29 20:18     ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 10/18] livepatch/powerpc: " Josh Poimboeuf
2016-05-03  9:07   ` Petr Mladek
2016-05-03 12:06     ` Miroslav Benes
2016-04-28 20:44 ` [RFC PATCH v2 11/18] livepatch/s390: reorganize TIF thread flag bits Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 12/18] livepatch/s390: add TIF_PATCH_PENDING thread flag Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 13/18] livepatch: separate enabled and patched states Josh Poimboeuf
2016-05-03  9:30   ` Petr Mladek
2016-05-03 13:48     ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 14/18] livepatch: remove unnecessary object loaded check Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 15/18] livepatch: move patching functions into patch.c Josh Poimboeuf
2016-05-03  9:39   ` Petr Mladek
2016-04-28 20:44 ` [RFC PATCH v2 16/18] livepatch: store function sizes Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model Josh Poimboeuf
2016-05-04  8:42   ` Petr Mladek
2016-05-04 15:51     ` Josh Poimboeuf
2016-05-05  9:41       ` Miroslav Benes
2016-05-05 13:06       ` Petr Mladek
2016-05-04 12:39   ` barriers: was: " Petr Mladek
2016-05-04 13:53     ` Peter Zijlstra
2016-05-04 16:51       ` Josh Poimboeuf
2016-05-04 14:12     ` Petr Mladek
2016-05-04 17:25       ` Josh Poimboeuf
2016-05-05 11:21         ` Petr Mladek
2016-05-09 15:42         ` Miroslav Benes
2016-05-04 17:02     ` Josh Poimboeuf
2016-05-05 10:21       ` Petr Mladek
2016-05-04 14:48   ` klp_task_patch: " Petr Mladek
2016-05-04 14:56     ` Jiri Kosina
2016-05-04 17:57     ` Josh Poimboeuf
2016-05-05 11:57       ` Petr Mladek
2016-05-06 12:38         ` Josh Poimboeuf
2016-05-09 12:23           ` Petr Mladek
2016-05-16 18:12             ` Josh Poimboeuf
2016-05-18 13:12               ` Petr Mladek
2016-05-06 11:33   ` Petr Mladek
2016-05-06 12:44     ` Josh Poimboeuf
2016-05-09  9:41   ` Miroslav Benes
2016-05-16 17:27     ` Josh Poimboeuf
2016-05-10 11:39   ` Miroslav Benes
2016-05-17 22:53   ` Jessica Yu
2016-05-18  8:16     ` Jiri Kosina
2016-05-18 16:51       ` Josh Poimboeuf
2016-05-18 20:22         ` Jiri Kosina
2016-05-23  9:42           ` David Laight
2016-05-23 18:44             ` Jiri Kosina
2016-05-24 15:06               ` David Laight
2016-05-24 22:45                 ` Jiri Kosina
2016-06-06 13:54   ` [RFC PATCH v2 17/18] " Petr Mladek
2016-06-06 14:29     ` Josh Poimboeuf
2016-04-28 20:44 ` [RFC PATCH v2 18/18] livepatch: add /proc/<pid>/patch_state Josh Poimboeuf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160502194444.warei6nshg3uvrvx@treble \
    --to=jpoimboe@redhat.com \
    --cc=chris.j.arges@canonical.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=jeyu@redhat.com \
    --cc=jikos@kernel.org \
    --cc=jslaby@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=live-patching@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mbenes@suse.cz \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=vojtech@suse.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).