* [PATCH] x86/mce: Reduce number of machine checks taken during recovery @ 2021-12-15 22:20 Tony Luck 2021-12-18 0:53 ` Peter Zijlstra 0 siblings, 1 reply; 6+ messages in thread From: Tony Luck @ 2021-12-15 22:20 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86, linux-kernel, patches, Youquan Song, Tony Luck From: Youquan Song <youquan.song@intel.com> A machine check recovery test case injects poison into memory and then tries a write(2) system call with a buffer that begins 64 bytes before the poison and includes the first poison byte: $ sudo ./einj_mem_uc -f -m 64:65:0 copyin 0: copyin vaddr = 0x7f13639cc400 paddr = 834fa0d400 ./einj_mem_uc: short (64 bytes) write to temp file page not present Big surprise ... still running. Thought that would be fatal Unusual number of MCEs seen: 4 Test passed While it executes correctly (just the 64 non-poison bytes are written) and recovers, four machine checks seems like a lot. When any of the copy functions in arch/x86/lib/copy_user_64.S consume poison and take a fault the fixup code copies the remaining byte count from %ecx to %edx and unconditionally jumps to .Lcopy_user_handle_tail to continue the copy in case any more bytes can be copied. If the fault was #PF this may copy more bytes (because the page fault handler might have fixed the fault). But when the fault is a machine check the original copy code will have copied all the way to the poisoned cache line. So .Lcopy_user_handle_tail will just take another machine check for no good reason. There are five distinct blocks of fixup code that branch to .Lcopy_user_handle_tail, so add a check there to check the trap type (in %eax) and simply return the count of remaining bytes. As well as reducing the number of machine checks, this also allows Skylake generation Xeons to recover some cases that currently fail. The is because "rep movsb" is only recoverable when source and destination are well aligned and the byte count is large. That useless call to .Lcopy_user_handle_tail may violate one or more of these conditions and generate a fatal machine check. With this change applied the count of machine checks to recover is reduced from four to three: $ sudo ./einj_mem_uc -f -m 64:65:0 copyin 0: copyin vaddr = 0x7f1e44845400 paddr = 55930b400 ./einj_mem_uc: short (64 bytes) write to temp file page not present Big surprise ... still running. Thought that would be fatal Unusual number of MCEs seen: 3 Test passed [Tony: Added more details to commit message] Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/lib/copy_user_64.S | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index 2797e630b9b1..8c53be99faa0 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -233,12 +233,19 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * eax uncopied bytes or 0 if successful. */ SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail) + cmp $X86_TRAP_MC,%eax + je 3f movl %edx,%ecx 1: rep movsb 2: mov %ecx,%eax ASM_CLAC ret +3: + movl %edx,%eax + ASM_CLAC + ret + _ASM_EXTABLE_CPY(1b, 2b) SYM_CODE_END(.Lcopy_user_handle_tail) -- 2.31.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] x86/mce: Reduce number of machine checks taken during recovery 2021-12-15 22:20 [PATCH] x86/mce: Reduce number of machine checks taken during recovery Tony Luck @ 2021-12-18 0:53 ` Peter Zijlstra 2021-12-23 20:07 ` [PATCH v2] " Luck, Tony 0 siblings, 1 reply; 6+ messages in thread From: Peter Zijlstra @ 2021-12-18 0:53 UTC (permalink / raw) To: Tony Luck; +Cc: Borislav Petkov, x86, linux-kernel, patches, Youquan Song On Wed, Dec 15, 2021 at 02:20:16PM -0800, Tony Luck wrote: > --- > arch/x86/lib/copy_user_64.S | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S > index 2797e630b9b1..8c53be99faa0 100644 > --- a/arch/x86/lib/copy_user_64.S > +++ b/arch/x86/lib/copy_user_64.S > @@ -233,12 +233,19 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) > * eax uncopied bytes or 0 if successful. > */ > SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail) > + cmp $X86_TRAP_MC,%eax > + je 3f > movl %edx,%ecx > 1: rep movsb > 2: mov %ecx,%eax > ASM_CLAC > ret > > +3: > + movl %edx,%eax > + ASM_CLAC > + ret > + > _ASM_EXTABLE_CPY(1b, 2b) > SYM_CODE_END(.Lcopy_user_handle_tail) This patch no longer applies; however, you forgot to change the comment on top about the calling convention, because now the function expects rax to contain the trap number. It's also not obvious from the massive rambling on top that all callsites were audited to make sure this is in fact true (it appears so). ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2] x86/mce: Reduce number of machine checks taken during recovery 2021-12-18 0:53 ` Peter Zijlstra @ 2021-12-23 20:07 ` Luck, Tony 2021-12-24 11:50 ` Peter Zijlstra 2021-12-31 17:34 ` [tip: ras/core] " tip-bot2 for Youquan Song 0 siblings, 2 replies; 6+ messages in thread From: Luck, Tony @ 2021-12-23 20:07 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Borislav Petkov, x86, linux-kernel, patches, Youquan Song From: Youquan Song <youquan.song@intel.com> When any of the copy functions in arch/x86/lib/copy_user_64.S take a fault, the fixup code copies the remaining byte count from %ecx to %edx and unconditionally jumps to .Lcopy_user_handle_tail to continue the copy in case any more bytes can be copied. If the fault was #PF this may copy more bytes (because the page fault handler might have fixed the fault). But when the fault is a machine check the original copy code will have copied all the way to the poisoned cache line. So .Lcopy_user_handle_tail will just take another machine check for no good reason. Every code path to .Lcopy_user_handle_tail comes from an exception fixup path, so add a check there to check the trap type (in %eax) and simply return the count of remaining bytes if the trap was a machine check. Doing this reduces the number of machine checks taken during synthetic tests from four to three. As well as reducing the number of machine checks, this also allows Skylake generation Xeons to recover some cases that currently fail. The is because "rep movsb" is only recoverable when source and destination are well aligned and the byte count is large. That useless call to .Lcopy_user_handle_tail may violate one or more of these conditions and generate a fatal machine check. [Tony: Added more details to commit message] Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/lib/copy_user_64.S | 9 +++++++++ 1 file changed, 9 insertions(+) V2: Update based on comments from PeterZ - Rebase to tip x86/core branch - Add comment documenting eax is now part of calling convention for .Lcopy_user_handle_tail - Trim commit comment to make it easier to see that I did audit all the call sequences for .Lcopy_user_handle_tail diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index e6ac38587b40..26781cbe7e37 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -212,6 +212,7 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * Don't try to copy the tail if machine check happened * * Input: + * eax x86 trap number - set by fixup_excpetion() * rdi destination * rsi source * rdx count @@ -220,12 +221,20 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * eax uncopied bytes or 0 if successful. */ SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail) + cmp $X86_TRAP_MC,%eax + je 3f + movl %edx,%ecx 1: rep movsb 2: mov %ecx,%eax ASM_CLAC RET +3: + movl %edx,%eax + ASM_CLAC + RET + _ASM_EXTABLE_CPY(1b, 2b) .Lcopy_user_handle_align: base-commit: 82a8954acd93ae95d6252fb93a3d210c8f71b093 -- 2.31.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2] x86/mce: Reduce number of machine checks taken during recovery 2021-12-23 20:07 ` [PATCH v2] " Luck, Tony @ 2021-12-24 11:50 ` Peter Zijlstra 2021-12-27 10:52 ` Borislav Petkov 2021-12-31 17:34 ` [tip: ras/core] " tip-bot2 for Youquan Song 1 sibling, 1 reply; 6+ messages in thread From: Peter Zijlstra @ 2021-12-24 11:50 UTC (permalink / raw) To: Luck, Tony; +Cc: Borislav Petkov, x86, linux-kernel, patches, Youquan Song On Thu, Dec 23, 2021 at 12:07:01PM -0800, Luck, Tony wrote: > diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S > index e6ac38587b40..26781cbe7e37 100644 > --- a/arch/x86/lib/copy_user_64.S > +++ b/arch/x86/lib/copy_user_64.S > @@ -212,6 +212,7 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) > * Don't try to copy the tail if machine check happened > * > * Input: > + * eax x86 trap number - set by fixup_excpetion() That's inaccurate, fixup_exception() (event if it's spelled correctly) does not unconditionally set the trap number in RAX, that's only done by ex_handler_fault() (or ex_handler_sgx()), which means all flows into this function must pass through: EX_TYPE_FAULT, EX_TYPE_FAULT_MCE or EX_TYPE_COPY. Boris might fix up your comment if he applies I suppose.. > * rdi destination > * rsi source > * rdx count > @@ -220,12 +221,20 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) > * eax uncopied bytes or 0 if successful. > */ > SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail) > + cmp $X86_TRAP_MC,%eax > + je 3f > + > movl %edx,%ecx > 1: rep movsb > 2: mov %ecx,%eax > ASM_CLAC > RET > > +3: > + movl %edx,%eax > + ASM_CLAC > + RET > + > _ASM_EXTABLE_CPY(1b, 2b) > > .Lcopy_user_handle_align: > > base-commit: 82a8954acd93ae95d6252fb93a3d210c8f71b093 > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] x86/mce: Reduce number of machine checks taken during recovery 2021-12-24 11:50 ` Peter Zijlstra @ 2021-12-27 10:52 ` Borislav Petkov 0 siblings, 0 replies; 6+ messages in thread From: Borislav Petkov @ 2021-12-27 10:52 UTC (permalink / raw) To: Peter Zijlstra, Luck, Tony; +Cc: x86, linux-kernel, patches, Youquan Song On Fri, Dec 24, 2021 at 12:50:01PM +0100, Peter Zijlstra wrote: > That's inaccurate, fixup_exception() (event if it's spelled correctly) > does not unconditionally set the trap number in RAX, that's only done by > ex_handler_fault() (or ex_handler_sgx()), which means all flows into > this function must pass through: EX_TYPE_FAULT, EX_TYPE_FAULT_MCE or > EX_TYPE_COPY. Yeah, they all flow through the copy thing, see _ASM_EXTABLE_CPY() everywhere in that file. > Boris might fix up your comment if he applies I suppose.. I did with the following changes so that tip branches merging can work relatively easily and so that I don't pull in the whole x86/core pile into ras/core. The automerge conflict resolve with tip/master is after the patch below and it looks ok to me too. Thoughts? --- From 0db3d12d3f1e8ee87a2f1cc6049a3a7c0e1f5e6c Mon Sep 17 00:00:00 2001 From: Youquan Song <youquan.song@intel.com> Date: Thu, 23 Dec 2021 12:07:01 -0800 Subject: [PATCH] x86/mce: Reduce number of machine checks taken during recovery When any of the copy functions in arch/x86/lib/copy_user_64.S take a fault, the fixup code copies the remaining byte count from %ecx to %edx and unconditionally jumps to .Lcopy_user_handle_tail to continue the copy in case any more bytes can be copied. If the fault was #PF this may copy more bytes (because the page fault handler might have fixed the fault). But when the fault is a machine check the original copy code will have copied all the way to the poisoned cache line. So .Lcopy_user_handle_tail will just take another machine check for no good reason. Every code path to .Lcopy_user_handle_tail comes from an exception fixup path, so add a check there to check the trap type (in %eax) and simply return the count of remaining bytes if the trap was a machine check. Doing this reduces the number of machine checks taken during synthetic tests from four to three. As well as reducing the number of machine checks, this also allows Skylake generation Xeons to recover some cases that currently fail. The is because REP; MOVSB is only recoverable when source and destination are well aligned and the byte count is large. That useless call to .Lcopy_user_handle_tail may violate one or more of these conditions and generate a fatal machine check. [ Tony: Add more details to commit message. ] [ bp: Fixup comment. Also, another tip patchset which is adding straight-line speculation mitigation changes the "ret" instruction to an all-caps macro "RET". But, since gas is case-insensitive, use "RET" in the newly added asm block already in order to simplify tip branch merging on its way upstream. ] Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/YcTW5dh8yTGucDd+@agluck-desk2.amr.corp.intel.com --- arch/x86/lib/copy_user_64.S | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index 2797e630b9b1..e73b76e5bc09 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -225,6 +225,7 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * Don't try to copy the tail if machine check happened * * Input: + * eax trap number written by ex_handler_copy() * rdi destination * rsi source * rdx count @@ -233,12 +234,20 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * eax uncopied bytes or 0 if successful. */ SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail) + cmp $X86_TRAP_MC,%eax + je 3f + movl %edx,%ecx 1: rep movsb 2: mov %ecx,%eax ASM_CLAC ret +3: + movl %edx,%eax + ASM_CLAC + RET + _ASM_EXTABLE_CPY(1b, 2b) SYM_CODE_END(.Lcopy_user_handle_tail) -- 2.29.2 automerge resolve with tip/master: commit a236a52e9c4b2869211aa1bb515856e625c30ac4 (HEAD -> refs/heads/test) Merge: be59580168ba 0db3d12d3f1e Author: Borislav Petkov <bp@suse.de> Date: Mon Dec 27 11:46:58 2021 +0100 Merge branch 'tip-ras-core' into test diff --cc arch/x86/lib/copy_user_64.S index e6ac38587b40,e73b76e5bc09..d4a0494e618b --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@@ -224,14 -241,14 +228,19 @@@ SYM_CODE_START_LOCAL(.Lcopy_user_handle 1: rep movsb 2: mov %ecx,%eax ASM_CLAC - ret + RET + 3: + movl %edx,%eax + ASM_CLAC + RET + _ASM_EXTABLE_CPY(1b, 2b) + +.Lcopy_user_handle_align: + addl %ecx,%edx /* ecx is zerorest also */ + jmp .Lcopy_user_handle_tail + SYM_CODE_END(.Lcopy_user_handle_tail) /* -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [tip: ras/core] x86/mce: Reduce number of machine checks taken during recovery 2021-12-23 20:07 ` [PATCH v2] " Luck, Tony 2021-12-24 11:50 ` Peter Zijlstra @ 2021-12-31 17:34 ` tip-bot2 for Youquan Song 1 sibling, 0 replies; 6+ messages in thread From: tip-bot2 for Youquan Song @ 2021-12-31 17:34 UTC (permalink / raw) To: linux-tip-commits Cc: Youquan Song, Tony Luck, Borislav Petkov, x86, linux-kernel The following commit has been merged into the ras/core branch of tip: Commit-ID: 3376136300a00df9a864b88fa969177d6c3be8e5 Gitweb: https://git.kernel.org/tip/3376136300a00df9a864b88fa969177d6c3be8e5 Author: Youquan Song <youquan.song@intel.com> AuthorDate: Thu, 23 Dec 2021 12:07:01 -08:00 Committer: Borislav Petkov <bp@suse.de> CommitterDate: Fri, 31 Dec 2021 18:22:32 +01:00 x86/mce: Reduce number of machine checks taken during recovery When any of the copy functions in arch/x86/lib/copy_user_64.S take a fault, the fixup code copies the remaining byte count from %ecx to %edx and unconditionally jumps to .Lcopy_user_handle_tail to continue the copy in case any more bytes can be copied. If the fault was #PF this may copy more bytes (because the page fault handler might have fixed the fault). But when the fault is a machine check the original copy code will have copied all the way to the poisoned cache line. So .Lcopy_user_handle_tail will just take another machine check for no good reason. Every code path to .Lcopy_user_handle_tail comes from an exception fixup path, so add a check there to check the trap type (in %eax) and simply return the count of remaining bytes if the trap was a machine check. Doing this reduces the number of machine checks taken during synthetic tests from four to three. As well as reducing the number of machine checks, this also allows Skylake generation Xeons to recover some cases that currently fail. The is because REP; MOVSB is only recoverable when source and destination are well aligned and the byte count is large. That useless call to .Lcopy_user_handle_tail may violate one or more of these conditions and generate a fatal machine check. [ Tony: Add more details to commit message. ] [ bp: Fixup comment. Also, another tip patchset which is adding straight-line speculation mitigation changes the "ret" instruction to an all-caps macro "RET". But, since gas is case-insensitive, use "RET" in the newly added asm block already in order to simplify tip branch merging on its way upstream. ] Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/YcTW5dh8yTGucDd+@agluck-desk2.amr.corp.intel.com --- arch/x86/lib/copy_user_64.S | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index 2797e63..e73b76e 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -225,6 +225,7 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * Don't try to copy the tail if machine check happened * * Input: + * eax trap number written by ex_handler_copy() * rdi destination * rsi source * rdx count @@ -233,12 +234,20 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string) * eax uncopied bytes or 0 if successful. */ SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail) + cmp $X86_TRAP_MC,%eax + je 3f + movl %edx,%ecx 1: rep movsb 2: mov %ecx,%eax ASM_CLAC ret +3: + movl %edx,%eax + ASM_CLAC + RET + _ASM_EXTABLE_CPY(1b, 2b) SYM_CODE_END(.Lcopy_user_handle_tail) ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-12-31 17:34 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-12-15 22:20 [PATCH] x86/mce: Reduce number of machine checks taken during recovery Tony Luck 2021-12-18 0:53 ` Peter Zijlstra 2021-12-23 20:07 ` [PATCH v2] " Luck, Tony 2021-12-24 11:50 ` Peter Zijlstra 2021-12-27 10:52 ` Borislav Petkov 2021-12-31 17:34 ` [tip: ras/core] " tip-bot2 for Youquan Song
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.