All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Michal Marek <mmarek@suse.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>, Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andi Kleen <andi@firstfloor.org>, Pedro Alves <palves@redhat.com>,
	Namhyung Kim <namhyung@gmail.com>,
	Bernd Petrovitsch <bernd@petrovitsch.priv.at>,
	Chris J Arges <chris.j.arges@canonical.com>,
	live-patching@vger.kernel.org
Subject: Re: [PATCH v10 03/20] x86/stackvalidate: Compile-time stack validation
Date: Wed, 19 Aug 2015 23:00:50 -0500	[thread overview]
Message-ID: <20150820040050.GC2944@treble.redhat.com> (raw)
In-Reply-To: <20150819100138.GA10504@gmail.com>

On Wed, Aug 19, 2015 at 12:01:38PM +0200, Ingo Molnar wrote:
> 
> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > On Sat, Aug 15, 2015 at 12:23:54AM -0700, Andrew Morton wrote:
> > > On Thu, 13 Aug 2015 22:10:24 -0500 Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > 
> > > > This adds a CONFIG_STACK_VALIDATION option which enables a host tool
> > > > named stackvalidate which runs at compile time.
> > > 
> > > It would be useful to
> > > 
> > > - show example output for a typical error site
> > > 
> > > - describe the consequences of that error (ie: why should we fix
> > >   these things?)
> > > 
> > > - describe what needs to be done to repair these sites.
> > > 
> > > 
> > > IOW, what do we gain from merging all this stuff?
> > 
> > I attempted to do all that in Documentation/stack-validation.txt which
> > is in patch 03/20.  Does it address your concerns?  Here it is:
> 
> I think it answers the first and third question, but not the second one:
> 
>     - describe the consequences of that error (ie: why should we fix
>       these things?)
> 
> would be nice to document all that richly.

Ok.  I've tried to answer the "why" question from both a broad
perspective ("why do we need stackvalidate?") as well as for each of the
rules that it enforces.

---8<---

Subject: [PATCH] stackvalidate: Document why it's needed

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 Documentation/stack-validation.txt | 122 +++++++++++++++++++++++++++++++++++--
 1 file changed, 118 insertions(+), 4 deletions(-)

diff --git a/Documentation/stack-validation.txt b/Documentation/stack-validation.txt
index c3d3d35..94dad40 100644
--- a/Documentation/stack-validation.txt
+++ b/Documentation/stack-validation.txt
@@ -24,6 +24,101 @@ instructions).  Similarly, it knows how to follow switch statements, for
 which gcc sometimes uses jump tables.
 
 
+Why do we need stack validation?
+--------------------------------
+
+Here are some of the benefits of validating stack metadata:
+
+a) More reliable stack traces for frame pointer enabled kernels
+
+   Frame pointers are used for debugging purposes.  They allow runtime
+   code and debug tools to be able to walk the stack to determine the
+   chain of function call sites that led to the currently executing
+   code.
+
+   For some architectures, frame pointers are enabled by
+   CONFIG_FRAME_POINTER.  For some other architectures they may be
+   required by the ABI (sometimes referred to as "backchain pointers").
+
+   For C code, gcc automatically generates instructions for setting up
+   frame pointers when the -fno-omit-frame-pointer option is used.
+
+   But for asm code, the frame setup instructions have to be written by
+   hand, which most people don't do.  So the end result is that
+   CONFIG_FRAME_POINTER is honored for C code but not for most asm code.
+
+   For stack traces based on frame pointers to be reliable, all
+   functions which call other functions must first create a stack frame
+   and update the frame pointer.  If a first function doesn't properly
+   create a stack frame before calling a second function, the *caller*
+   of the first function will be skipped on the stack trace.
+
+   The benefit of stackvalidate here is that it ensures that *all*
+   functions honor CONFIG_FRAME_POINTER.  As a result, no functions will
+   ever [*] be skipped on a stack trace.
+
+   [*] unless an interrupt or exception has occurred at the very
+       beginning of a function before the stack frame has been created,
+       or at the very end of the function after the stack frame has been
+       destroyed.  This is an inherent limitation of frame pointers.
+
+b) 100% reliable stack traces for DWARF enabled kernels
+
+   (NOTE: This is not yet implemented)
+
+   As an alternative to frame pointers, DWARF Call Frame Information
+   (CFI) metadata can be used to walk the stack.  Unlike frame pointers,
+   CFI metadata is out of band.  So it doesn't affect runtime
+   performance and it can be reliable even when interrupts or exceptions
+   are involved.
+
+   For C code, gcc automatically generates DWARF CFI metadata.  But for
+   asm code, generating CFI is a tedious manual approach which requires
+   manually placed .cfi assembler macros to be scattered throughout the
+   code.  It's clumsy and very easy to get wrong, and it makes the real
+   code harder to read.
+
+   Stackvalidate will improve this situation in several ways.  For code
+   which already has CFI annotations, it will validate them.  For code
+   which doesn't have CFI annotations, it will generate them.  So an
+   architecture can opt to strip out all the manual .cfi annotations
+   from their asm code and have stackvalidate generate them instead.
+
+   We might also add a runtime stack validation debug option where we
+   periodically walk the stack from schedule() and/or an NMI to ensure
+   that the stack metadata is sane and that we reach the bottom of the
+   stack.
+
+   So the benefit of stackvalidate here will be that external tooling
+   should always show perfect stack traces.  And the same will be true
+   for kernel warning/oops traces if the architecture has a runtime
+   DWARF unwinder.
+
+c) Higher live patching compatibility rate
+
+   (NOTE: This is not yet implemented)
+
+   Currently with CONFIG_LIVEPATCH there's a basic live patching
+   framework which is safe for roughly 85-90% of "security" fixes.  But
+   patches can't have complex features like function dependency or
+   prototype changes, or data structure changes.
+
+   There's a strong need to support patches which have the more complex
+   features so that the patch compatibility rate for security fixes can
+   eventually approach something resembling 100%.  To achieve that, a
+   "consistency model" is needed, which allows tasks to be safely
+   transitioned from an unpatched state to a patched state.
+
+   One of the key requirements of the currently proposed livepatch
+   consistency model [*] is that it needs to walk the stack of each
+   sleeping task to determine if it can be transitioned to the patched
+   state.  If stackvalidate can ensure that stack traces are reliable,
+   this consistency model can be used and the live patching
+   compatibility rate can be improved significantly.
+
+   [*] https://lkml.kernel.org/r/cover.1423499826.git.jpoimboe@redhat.com
+
+
 Rules
 -----
 
@@ -35,15 +130,26 @@ To achieve the validation, stackvalidate enforces the following rules:
    outside of a function, it flags an error since that usually indicates
    callable code which should be annotated accordingly.
 
+   This rule is needed so that stackvalidate can properly identify each
+   callable function in order to analyze its stack metadata.
+
 2. Conversely, each section of code which is *not* callable should *not*
    be annotated as an ELF function.  The ENDPROC macro shouldn't be used
    in this case.
 
+   This rule is needed so that stackvalidate can ignore non-callable
+   code.  Such code doesn't have to follow any of the other rules.
+
 3. Each callable function which calls another function must have the
    correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
    the architecture's back chain rules.  This can by done in asm code
    with the FRAME_BEGIN/FRAME_END macros.
 
+   This rule ensures that frame pointer based stack traces will work as
+   designed.  If function A doesn't create a stack frame before calling
+   function B, the _caller_ of function A will be skipped on the stack
+   trace.
+
 4. Dynamic jumps and jumps to undefined symbols are only allowed if:
 
    a) the jump is part of a switch statement; or
@@ -51,10 +157,18 @@ To achieve the validation, stackvalidate enforces the following rules:
    b) the jump matches sibling call semantics and the frame pointer has
       the same value it had on function entry.
 
+   This rule is needed so that stackvalidate can reliably analyze all of
+   a function's code paths.  If a function jumps to code in another
+   file, and it's not a sibling call, stackvalidate has no way to follow
+   the jump because it only analyzes a single file at a time.
+
 5. A callable function may not execute kernel entry/exit instructions.
    The only code which needs such instructions is kernel entry code,
    which shouldn't be be in callable functions anyway.
 
+   This rule is just a sanity check to ensure that callable functions
+   return normally.
+
 
 Errors in .S files
 ------------------
@@ -63,8 +177,8 @@ If you're getting an error in a compiled .S file which you don't
 understand, first make sure that the affected code follows the above
 rules.
 
-Here are some examples of common problems and suggestions for how to fix
-them.
+Here are some examples of common warnings reported by stackvalidate,
+what they mean, and suggestions for how to fix them.
 
 
 1. stackvalidate: asm_file.o: func()+0x128: call without frame pointer save/setup
@@ -73,8 +187,8 @@ them.
    updating the frame pointer.
 
    If func() is indeed a callable function, add proper frame pointer
-   logic using the FP_SAVE and FP_RESTORE macros.  Otherwise, remove its
-   ELF function annotation by changing ENDPROC to END.
+   logic using the FRAME_BEGIN and FRAME_END macros.  Otherwise, remove
+   its ELF function annotation by changing ENDPROC to END.
 
    If you're getting this error in a .c file, see the "Errors in .c
    files" section.
-- 
2.4.3


  reply	other threads:[~2015-08-20  4:00 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-14  3:10 [PATCH v10 00/20] Compile-time stack validation Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 01/20] x86/asm: Frame pointer macro cleanup Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 02/20] x86/asm: Add C versions of frame pointer macros Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 03/20] x86/stackvalidate: Compile-time stack validation Josh Poimboeuf
2015-08-15  7:23   ` Andrew Morton
2015-08-15 12:49     ` Josh Poimboeuf
2015-08-19 10:01       ` Ingo Molnar
2015-08-20  4:00         ` Josh Poimboeuf [this message]
2015-08-21  7:54           ` Ingo Molnar
2015-08-21 13:32             ` Josh Poimboeuf
2015-08-22  9:17               ` Ingo Molnar
2015-08-14  3:10 ` [PATCH v10 04/20] x86/stackvalidate: Add file and directory ignores Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 05/20] x86/stackvalidate: Add ignore macros Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 06/20] x86/xen: Add stack frame dependency to hypercall inline asm calls Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 07/20] x86/paravirt: Add stack frame dependency to PVOP " Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 08/20] x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 09/20] x86/amd: Set ELF function type for vide() Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 10/20] x86/reboot: Add ljmp instructions to stackvalidate whitelist Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 11/20] x86/xen: Add xen_cpuid() and xen_setup_gdt() to stackvalidate whitelists Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 12/20] x86/asm/crypto: Create stack frames in aesni-intel_asm.S Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 13/20] x86/asm/crypto: Move .Lbswap_mask data to .rodata section Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 14/20] x86/asm/crypto: Move jump_table " Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 15/20] x86/asm/crypto: Create stack frames in clmul_ghash_mul/update() Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 16/20] x86/asm/entry: Create stack frames in thunk functions Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 17/20] x86/asm/acpi: Create a stack frame in do_suspend_lowlevel() Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 18/20] x86/asm: Create stack frames in rwsem functions Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 19/20] x86/asm/efi: Create a stack frame in efi_call() Josh Poimboeuf
2015-08-14  9:11   ` Matt Fleming
2015-08-14 14:07     ` Josh Poimboeuf
2015-08-14  3:10 ` [PATCH v10 20/20] x86/asm/power: Create stack frames in hibernate_asm_64.S Josh Poimboeuf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150820040050.GC2944@treble.redhat.com \
    --to=jpoimboe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=bernd@petrovitsch.priv.at \
    --cc=bp@alien8.de \
    --cc=chris.j.arges@canonical.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mmarek@suse.cz \
    --cc=namhyung@gmail.com \
    --cc=palves@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.