public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, jkosina@suse.cz,
	zdenek.kabelac@gmail.com, Arjan van de Ven <arjan@infradead.org>
Subject: Re: [PATCH REPOST^3] Run IST traps from user mode preemptive on process stack
Date: Tue, 6 May 2008 16:34:00 +0200	[thread overview]
Message-ID: <20080506143400.GA26162@elte.hu> (raw)
In-Reply-To: <87skwvbn4m.fsf@basil.nowhere.org>


* Andi Kleen <andi@firstfloor.org> wrote:

> > This introduces two security bugs in one go:
> >
> > 1.) The IST stack pointer is elevated unconditionaly and with your 
> > change we can now schedule away in the handler.
> 
> Yes, but that's fine.

no, your patch is not fine at all. It introduces a security hole, as 
Thomas pointed it out to you.

The IST pointer in the _TSS_ can go out of bounds by your patch! That 
corruption can be controlled by malicious userspace. Read the analysis 
from Thomas.

That percpu_tss.ist[] location is not just a random pointer. It is the 
TSS that is read by the CPU on every trap. It is a security-critical 
structure and every modification to it has to be treated very, very 
carefully. If it goes out of bounds that leads to very nasty problems.

This is a pretty bad security bug, and your reply shows ignorance about 
the code your patch is modifying - _after_ Thomas has pointed it out to 
you.

> > this same CPU triggers the same trap and elevates it again.
> 
> That's not possible generally. None of these interrupts can nest in a 
> normal kernel.

read the analysis from Thomas. Here is a sample buggy sequence:

  task #1 runs, does int3, goes on DEBUG_STACK IST stack,
      we do -= 4096 of the tss.ist[DEBUG_STACK-1] pointer, handler schedules
  task #2 runs, does int3, goes on DEBUG_STACK IST stack, handler schedules
      we do -= 4096 of the tss.ist[DEBUG_STACK-1] pointer, handler schedules
  task #3 runs, does int3, goes on DEBUG_STACK IST

  => poof, stack underflow, memory corruption of nearby data structures,
     because the DEBUG_STACK is only 8KB...

this can be triggered in almost arbitrary depth, from user-space.

> Do you refer to the DEBUG_STACK ist add/dec? That is not needed for 
> anything in tree to my knowledge.

Wrong. The pointer the subq/addq instructions modify are in fact used by 
_EVERY SINGLE_ IST trap sequence that the 64-bit kernel executes (!).

This is the modification that the paranoidentry macro does to the TSS in 
entry_64.S:

  subq $EXCEPTION_STKSZ, per_cpu__init_tss + TSS_ist + (\ist - 1) * 8(%rbp)
  ...
  addq $EXCEPTION_STKSZ, per_cpu__init_tss + TSS_ist + (\ist - 1) * 8(%rbp)

that modifies the TSS _directly_. The TSS is directly read by the CPU on 
every trap entry. The percpu_tss.ist[] array of pointers is then used by 
the CPU for every single IST trap. (hw-debug, NMI, int3, double-fault, 
stacksegment-overflow, MCE)

This is nothing new or unexpected, it is basic x86-64 IST architectural 
behavior implemented in the CPU.

The bug you introduce is that if the handler schedules away (it blocks 
or gets preempted on CONFIG_PREEMPT), it would keep the per-CPU IST 
offset for that IST type decreased by 4096 => if done enough times the 
pointer goes out of bounds and it's kaboom.

> > 2.) The IST stack pointer is unbalanced in the other direction as 
> > well: it is incremented on CPUn then the handler is scheduled away 
> > and migrated to CPUm. After return from the handler the IST stacks 
> > of both CPUs are corrupted. Exploitable by unprivileged user-space 
> > code as well.
> 
> The IST is restored again after the handler. [...]

yes but it is restored on the wrong CPU, after the task has scheduled 
and migrated - moving the IST pointer in the TSS out of bounds, so the 
next IST trap (which user-space can trigger arbitrarily - just generate 
a stream of INT3 breakpoints) will corrupt memory!

> [...] You're right that strictly wouldn't be needed and could be 
> avoided, but i don't think it's exploitable in any ways.
> 
> Regarding user controlled pt_regs: I think you're forgetting that 
> x86-64 doesn't have vm86 mode and that we can always trust pt_regs 
> segment indexes. On i386 you would be right, but not here.

Thomas is not forgetting anything.
 
Thomas is doing security analysis about how the hole in your patch can 
be exploited: the wrong IST offsets are used to corrupt nearby data 
structures - a malicious nonprivileged user task can modify a good 
portion of the ~64 bytes of pt_regs at the end of every page near the 
IST stack address (and randomly corrupt some bytes below that) - just by 
virtue of generating an int3 (or hw-debug) trap with prepared register 
contents.

You first need to understand the hole you are introducing to understand 
the finer aspects of his security analysis though...

	Ingo

  reply	other threads:[~2008-05-06 14:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-02  9:19 [PATCH REPOST^3] Run IST traps from user mode preemptive on process stack Andi Kleen
2008-05-06 12:31 ` Thomas Gleixner
2008-05-06 13:03   ` Andi Kleen
2008-05-06 14:34     ` Ingo Molnar [this message]
2008-05-06 14:39     ` Ingo Molnar
2008-05-06 14:41     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080506143400.GA26162@elte.hu \
    --to=mingo@elte.hu \
    --cc=andi@firstfloor.org \
    --cc=arjan@infradead.org \
    --cc=jkosina@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=zdenek.kabelac@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox