All of lore.kernel.org
 help / color / mirror / Atom feed
From: Denys Vlasenko <dvlasenk@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>,
	Andy Lutomirski <luto@kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Borislav Petkov <bp@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH urgent v2] x86, asm: Disable opportunistic SYSRET if regs->flags has TF set
Date: Thu, 02 Apr 2015 14:59:22 +0200	[thread overview]
Message-ID: <551D3D2A.2040802@redhat.com> (raw)
In-Reply-To: <20150402123159.GA25151@gmail.com>

On 04/02/2015 02:31 PM, Ingo Molnar wrote:
> 
> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
> 
>> On 04/02/2015 01:14 PM, Brian Gerst wrote:
>>>>>> So I merged this as it's an obvious bugfix, but in hindsight I'm
>>>>>> really uneasy about the whole opportunistic SYSRET concept: it appears
>>>>>> that the chance that %rcx matches return-%rip is astronomical - this
>>>>>> is why this bug wasn't noticed live so far.
>>>>>>
>>>>>> So should we really be doing this?
>>>>>
>>>>> Andy does this not for the off-chance that userspace's RCX is equal
>>>>> to return address and R11 == RFLAGS. The chances of that are
>>>>> astronomically small.
>>>>>
>>>>> This code path triggers when ptrace/audit/seccomp is active. Instead
>>>>> of torturing ourselves trying to not divert into IRET return, now
>>>>> code is steered that way. But then immediately before actual IRET,
>>>>> we check again: "do we really need IRET?" IOW "did ptrace really
>>>>> touch pt_regs->ss? ->flags? ->rip? ->rcx?" which in vast majority of
>>>>> cases will not be true.
>>>>
>>>> I keep forgetting about that, my test systems have the audit muck
>>>> turned off ;-)
>>>>
>>>> Fair enough - and it's sensible to share the IRET path between
>>>> interrupts and complex-return system calls, even though the check
>>>> is unnecessary overhead for the pure interrupt return path...
>>>
>>>
>>> Maybe we could reintroduce TIF_IRET for this purpose instead of
>>> (ab)using TIF_NOTIFY_RESUME.  Then we would only do the opportunistic
>>> check for those cases (ptrace, audit, exec, sigreturn, etc.), and skip
>>> it for interrupts.
>>
>> The very first check in the existing code, pt_regs->cx == 
>> pt_regs->ip, will fail for interrupt returns.
>>
>> You hardly can save anything by placing a (ti->flags & 
>> TIF_TRY_SYSRET) check in front of it, it's almost as expensive.
> 
> Well, what I was thinking of was to have a pure irq (well, async 
> context) return path, not shared with the weird-syscall-IRET return 
> path at all ...
> 
> It would be open coded, not obfuscated via macros.
> 
> That way AFAICS the upsides are:
> 
>   - it's easier to read (and maintain) what goes on in which case.
>     '*intr*' labels would truly identify interrupt return related 
>     processing, for a change!

Re labels: I fully agree they need cleanup (mass rename).

Something along the lines of

int_ret_from_sys_call -> return_from_syscall
int_with_check   -> sysret_check_workmask_in_edi
int_careful      -> sysret_check_NEED_RESCHED
int_very_careful -> sysret_check_SYSCALL_EXIT
int_signal       -> sysret_check_DO_NOTIFY_MASK
int_restore_rest -> sysret_next_check

ret_from_intr    -> return_from_intr
retint_with_reschedule -> intr_check_WORK_MASK
retint_check     -> intr_check_workmask_in_edi
retint_careful   -> intr_check_NEED_RESCHED
retint_signal    -> intr_check_DO_NOTIFY_MASK

retint_swapgs    -> return_from_syscall_or_intr
irq_return_via_sysret -> return_via_sysret

retint_kernel    -> intr_check_preempt
restore_args     -> restore_c_regs
irq_return       -> return_via_iret


and then your proposal can be rephrased as "let's stop
merging sysret and intr code paths at retint_swapgs".

Makes sense. It would entail some code duplication,
but the code will be easier to maintain.

>   - we can optimize in a more directed fashion - like here
> 
> ... while the downsides are:
> 
>   - more code
>   - a (small) chance of a fix going to one path while not the other.
> 
> How much extra code would it be?

A screenful or two.

  reply	other threads:[~2015-04-02 12:59 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-01 21:26 [PATCH urgent v2] x86, asm: Disable opportunistic SYSRET if regs->flags has TF set Andy Lutomirski
2015-04-02  6:21 ` Borislav Petkov
2015-04-02  9:07 ` Ingo Molnar
2015-04-02 10:07   ` Denys Vlasenko
2015-04-02 10:37     ` Ingo Molnar
2015-04-02 11:14       ` Brian Gerst
2015-04-02 12:24         ` Denys Vlasenko
2015-04-02 12:31           ` Ingo Molnar
2015-04-02 12:59             ` Denys Vlasenko [this message]
2015-04-02 15:49               ` Denys Vlasenko
2015-04-02 16:08                 ` Ingo Molnar
2015-04-02 14:26             ` Andy Lutomirski
2015-04-02 12:32 ` [tip:x86/urgent] x86/asm/entry/64: " tip-bot for Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=551D3D2A.2040802@redhat.com \
    --to=dvlasenk@redhat.com \
    --cc=bp@alien8.de \
    --cc=bp@suse.de \
    --cc=brgerst@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.