From: Tony Luck <tony.luck@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>,
	Youquan Song <youquan.song@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/6] Add machine check recovery when copying from user space
Date: Tue,  6 Oct 2020 14:09:04 -0700
Message-ID: <20201006210910.21062-1-tony.luck@intel.com>
In-Reply-To: <20201005163130.GD21151@zn.tnic>

Machine check recovery from uncorrected memory errors currently focuses
primarily on errors that are detected while running in user mode. There
is a mechanism for recovering from errors in kernel code, but it is
currently only used for memcpy_mcsafe().

The existing recovery actions for errors found in user mode (unmap the
page and send SIGBUS to the task) can also be applied when the error is
found while copying data from user space to the kernel.

Roadmap to this series:

This series is based on the tip ras/core branch because the original
part 0001 has already been applied there:
13c877f4b48b ("x86/mce: Stop mce_reign() from re-computing severity for every CPU")
so the new part 0001 below was part 0002 in the v1 series.

In v3, old part 0005 has been merged into the final part, so there are
now just 6 parts.

0001:   First piece of infrastructure update. Severity calculations need
        access to the saved registers, so pass a pointer to them down
        the call chain.

0002:   Need to know what type of exception handler is present
        for a given kernel instruction. Rather than proliferate more
        functions like ex_has_fault_handler() for each type, replace
        with a function that looks up the handler and returns an enum
        describing the type.
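
As a rough illustration of the idea in part 0002, here is a minimal
user-space sketch of a single lookup that returns an enum. It is not the
code from the patch: the enum values, struct extable_entry and
get_handler_type() are hypothetical names, and the table stores the type
directly as a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler kinds; the names are illustrative only. */
enum handler_type {
	HANDLER_NONE,     /* no exception table entry for this address */
	HANDLER_FAULT,    /* generic fault fixup */
	HANDLER_UACCESS,  /* fixup for a user-space access */
	HANDLER_OTHER,    /* any other fixup */
};

/* Simplified stand-in for an exception table entry (assumption). */
struct extable_entry {
	unsigned long insn_addr;  /* address of the instruction that may fault */
	enum handler_type type;   /* which kind of fixup is registered */
};

/*
 * One lookup that reports the handler type, instead of a separate
 * yes/no helper such as ex_has_fault_handler() for each type.
 */
static enum handler_type
get_handler_type(const struct extable_entry *table, size_t n, unsigned long ip)
{
	assert(table != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (table[i].insn_addr == ip)
			return table[i].type;
	}
	return HANDLER_NONE;
}
```

A caller can then switch on the returned value rather than calling one
predicate per handler type.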

0003:   *copy_user*() faults need slightly different handling from
        get_user() faults. Create a new exception table tag and apply
        it to the copy functions.

Change since v2: Reword commit message to avoid use of "we".

0004:   In the fixup path of the copy functions, avoid dealing with the
        tail when the copy took a machine check by reporting that
        there are no bytes left to be copied.
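
The behaviour described in part 0004 can be sketched in plain C. The real
change lives in the assembly fixup path, so this is only a model of the
decision: enum trap_kind and copy_fixup_remaining() are hypothetical
names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trap kinds; the names are illustrative only. */
enum trap_kind { TRAP_PAGE_FAULT, TRAP_MACHINE_CHECK };

/*
 * Decide what a copy routine reports after a fault interrupted it.
 * 'total' is the requested length, 'copied' the bytes already moved.
 */
static size_t copy_fixup_remaining(enum trap_kind trap, size_t total,
				   size_t copied)
{
	assert(copied <= total);

	/*
	 * After a machine check, report "nothing left" so that no tail
	 * copy is attempted on the remaining bytes.
	 */
	if (trap == TRAP_MACHINE_CHECK)
		return 0;

	/* For an ordinary fault, report the uncopied byte count as usual. */
	return total - copied;
}
```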

0005:   Changes to do_machine_check() to support the new recovery flow.
        Some refactoring to avoid code duplication (since the flows
        for "error in user mode" and "error while copying from user
        mode" are almost identical). A couple of new fields are added
        to the task structure.

Change since v2: Boris supplied a helper function to make the re-factor
	much simpler. Use it instead of the spaghetti code in v2.

0006:   Finally, the keystone patch that pulls all the parts together.
        An instruction decoder figures out whether an instruction
        tagged as accessing user space is reading from or writing
        to user space. The instructions in the switch were found
        experimentally by looking at which instructions in the base
        kernel are tagged in the exception table. I didn't add the
        atomic operations (0x87 = XCHG etc.) that both read and write
        user addresses. I think they should be safe, but I need a test
        case where a futex has been poisoned to check. This switch
        should probably be expanded to cover all the instructions that
        the compiler could possibly generate that read from user space.

Change since v2: Merged old part 0005 into this piece since this is
	where function fault_in_kernel_space() is used.
	Check that the modrm.got and sib.got fields in "insn" were set
	before calling insn_get_addr().
	Change the type of the constant from ~0ul to -1l when checking
	whether the address returned by insn_get_addr() is valid.
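
To make the opcode switch in part 0006 more concrete, here is a
simplified user-space sketch. It is not the decoder from the patch:
opcode_reads_memory() is a hypothetical helper, the opcode list is an
assumption about what such a switch might contain, and prefixes and
ModRM/SIB bytes are not decoded, so the memory-operand form is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when the opcode bytes select an instruction whose memory
 * operand is only read. Simplified: no prefix or ModRM/SIB handling.
 */
static bool opcode_reads_memory(const uint8_t *op, size_t len)
{
	assert(op != NULL);

	if (len == 0)
		return false;

	switch (op[0]) {
	case 0x8A: case 0x8B:	/* MOV reg, r/m: load */
	case 0xA4: case 0xA5:	/* MOVS: string move, source is read */
		return true;
	case 0x0F:		/* two-byte opcode escape */
		if (len < 2)
			return false;
		/* MOVZX: zero-extending load */
		return op[1] == 0xB6 || op[1] == 0xB7;
	default:
		/* Includes 0x87 (XCHG), which both reads and writes. */
		return false;
	}
}
```

In this sketch a store such as 0x89 (MOV r/m, reg) and the read-write
XCHG both fall through to the default case, matching the cautious
treatment of atomics described above.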


Tony Luck (4):
  x86/mce: Provide method to find out the type of exception handle
  x86/mce: Avoid tail copy when machine check terminated a copy from
    user
  x86/mce: Recover from poison found while copying from user space
  x86/mce: Decode a kernel instruction to determine if it is copying
    from user

Youquan Song (2):
  x86/mce: Pass pointer to saved pt_regs to severity calculation
    routines
  x86/mce: Add _ASM_EXTABLE_CPY for copy user access

 arch/x86/include/asm/asm.h         |   6 ++
 arch/x86/include/asm/extable.h     |   9 ++-
 arch/x86/include/asm/mce.h         |  15 ++++
 arch/x86/include/asm/traps.h       |   2 +
 arch/x86/kernel/cpu/mce/core.c     |  52 +++++++++-----
 arch/x86/kernel/cpu/mce/internal.h |   3 +-
 arch/x86/kernel/cpu/mce/severity.c |  70 ++++++++++++++++--
 arch/x86/lib/copy_user_64.S        | 111 ++++++++++++++++-------------
 arch/x86/mm/extable.c              |  24 +++++--
 arch/x86/mm/fault.c                |   2 +-
 include/linux/sched.h              |   2 +
 11 files changed, 217 insertions(+), 79 deletions(-)


base-commit: 5da8e4a658109e3b7e1f45ae672b7c06ac3e7158
-- 
2.21.1

