From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BC7E415218A; Thu, 11 Jul 2024 08:10:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=90.155.92.199 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720685459; cv=none; b=IOdyr3jXttBkA3ba1/yw0X7/0a+DMqjm0vlMeAu3nR58z639erAIs3kPyGDhvib6wPbjUaV84pZHrn15840GuKvpBfFMxehcOrZugTzjh1ehHwm5i8QmDt5X0FS/F4ZAMxlKgHoQ3QBhOxZUkyrD72j1S+UE6wX1xLX4rUodIis= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720685459; c=relaxed/simple; bh=/HTkHdqYMypFj69lROlJruJZAQSVM6krQHth0+z0JoE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=CFlciTjkuvlvQFuImjpEuClw1aUZPP5Ao2K0MrrcWd0CWtqv1/CGQ9qPJMCMUn+TtPXA13kQ1nJgo4MM0zzC7L9kpLEnstKzXgpBEA6eLFIwjVZRDQuQyu5OrR4heP8c+/LixnFo8NX9fTW9I62ZuNLoilpZ+SFjdaCaEviQ5r8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=b0XQmENp; arc=none smtp.client-ip=90.155.92.199 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="b0XQmENp" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=tDeiiGMS+Rb6GTSNg6AylsTFuSIZN3CBnV/fJ+EfT/g=; b=b0XQmENpAQWL/ZC1NIcpoHI2MV jkYzu6bqJ3qGTy/JD8aT5kdGJqV1tdXSFwFtly4Bk7nBiq+JRCHVXixWA1vuh+DZcT1NEf9tugW5T TrLzj9wCteRIlbqSSi7EKdChG918QJYZoc3I3DjxTLqXUF+GW37p1HLY3BFUNNnlKx7ONhhIXEZps wCWNxiuhxvJ5/rmCU54yq3M643WQb1LE9P+e6orBOjyZJOP0XHUPUayJhwRpIEnVu8Uu9KnaLwca2 9F0/CRJwQbCR0Qupg8tfn33iYPYw12NVWmKk095aGFAwuHXrUdIjq1e9vyjuSoa1iPZpgfTO8kCg/ qpS33cbA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sRosq-000000013Xo-1VwT; Thu, 11 Jul 2024 08:10:32 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 1000) id D600B30050D; Thu, 11 Jul 2024 10:10:31 +0200 (CEST) Date: Thu, 11 Jul 2024 10:10:31 +0200 From: Peter Zijlstra To: Gatlin Newhouse Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Kees Cook , Marco Elver , Andrey Konovalov , Andrey Ryabinin , Nathan Chancellor , Nick Desaulniers , Bill Wendling , Justin Stitt , Andrew Morton , Baoquan He , Rick Edgecombe , Pengfei Xu , Josh Poimboeuf , Changbin Du , Xin Li , Jason Gunthorpe , Arnd Bergmann , "Kirill A. Shutemov" , linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, linux-hardening@vger.kernel.org, llvm@lists.linux.dev Subject: Re: [PATCH v4] x86/traps: Enable UBSAN traps on x86 Message-ID: <20240711081031.GB4587@noisy.programming.kicks-ass.net> References: <20240710203250.238782-1-gatlin.newhouse@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240710203250.238782-1-gatlin.newhouse@gmail.com> On Wed, Jul 10, 2024 at 08:32:38PM +0000, Gatlin Newhouse wrote: > Currently ARM architectures extract which specific sanitizer > has caused a trap via encoded data in the trap instruction. > Clang on x86 currently encodes the same data in ud1 instructions > but the x86 handle_bug() and is_valid_bugaddr() functions > currently only look at ud2s. > > Bring x86 to parity with arm64, similar to commit 25b84002afb9 > ("arm64: Support Clang UBSAN trap codes for better reporting"). > Enable the reporting of UBSAN sanitizer detail on x86 architectures > compiled with clang when CONFIG_UBSAN_TRAP=y. Can we please get some actual words on what code clang will generate for this? This doesn't even refer to the clang commit. How am I supposed to know if the below patch matches what clang will generate etc.. > diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h > index a3ec87d198ac..ccd573d58edb 100644 > --- a/arch/x86/include/asm/bug.h > +++ b/arch/x86/include/asm/bug.h > @@ -13,6 +13,17 @@ > #define INSN_UD2 0x0b0f > #define LEN_UD2 2 > > +/* > + * In clang we have UD1s reporting UBSAN failures on X86, 64 and 32bit. > + */ > +#define INSN_ASOP 0x67 I asked, but did not receive answer, *WHY* does clang add this silly prefix? AFAICT this is entirely spurious and things would be simpler if we don't have to deal with it. > +#define OPCODE_PREFIX 0x0f This is *NOT* a prefix, it is an escape, please see the SDM Vol 2 Chapter 'Instruction Format'. That ASOP thing above is a prefix. > +#define OPCODE_UD1 0xb9 > +#define OPCODE_UD2 0x0b These are second byte opcodes. The actual (single byte opcodes) of those value exist and are something entirely different (0xB0+r is MOV, and 0x0B is OR). > +#define BUG_NONE 0xffff > +#define BUG_UD1 0xfffe > +#define BUG_UD2 0xfffd These are return codes and not related to the defines above and as such should be separated from them with some whitespace. > + > #ifdef CONFIG_GENERIC_BUG > > #ifdef CONFIG_X86_32 > diff --git a/arch/x86/include/asm/ubsan.h b/arch/x86/include/asm/ubsan.h > new file mode 100644 > index 000000000000..ac2080984e83 > --- /dev/null > +++ b/arch/x86/include/asm/ubsan.h > @@ -0,0 +1,23 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _ASM_X86_UBSAN_H > +#define _ASM_X86_UBSAN_H > + > +/* > + * Clang Undefined Behavior Sanitizer trap mode support. > + */ > +#include > +#include > +#include > + > +/* > + * UBSAN uses the EAX register to encode its type in the ModRM byte. This is a claim, but I have nothing to verify this against. I mean, I could go trawl through the clang sources, but this really should be part of the changelog to explain the clang code generation. > + */ > +#define UBSAN_REG 0x40 This is a ModRM byte, not a REG. The REG encoded therein is 0. > + > +#ifdef CONFIG_UBSAN_TRAP > +void handle_ubsan_failure(struct pt_regs *regs, u16 insn); > +#else > +static inline void handle_ubsan_failure(struct pt_regs *regs, u16 insn) { return; } > +#endif /* CONFIG_UBSAN_TRAP */ > + > +#endif /* _ASM_X86_UBSAN_H */ > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile > index 74077694da7d..fe1d9db27500 100644 > --- a/arch/x86/kernel/Makefile > +++ b/arch/x86/kernel/Makefile > @@ -145,6 +145,7 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o > obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o > > obj-$(CONFIG_CFI_CLANG) += cfi.o > +obj-$(CONFIG_UBSAN_TRAP) += ubsan.o > > obj-$(CONFIG_CALL_THUNKS) += callthunks.o > > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c > index 4fa0b17e5043..b6664016622a 100644 > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -67,6 +67,7 @@ > #include > #include > #include > +#include > > #ifdef CONFIG_X86_64 > #include > @@ -91,6 +92,45 @@ __always_inline int is_valid_bugaddr(unsigned long addr) > return *(unsigned short *)addr == INSN_UD2; > } > > +/* > + * Check for UD1 or UD2, accounting for Address Size Override Prefixes. > + * If it's a UD1, get the ModRM byte to pass along to UBSan. > + */ > +__always_inline int decode_bug(unsigned long addr, u32 *imm) > +{ > + u8 v; > + > + if (addr < TASK_SIZE_MAX) > + return BUG_NONE; > + > + v = *(u8 *)(addr++); > + if (v == INSN_ASOP) > + v = *(u8 *)(addr++); > + if (v != OPCODE_PREFIX) > + return BUG_NONE; > + > + v = *(u8 *)(addr++); > + if (v == OPCODE_UD2) > + return BUG_UD2; > + if (v != OPCODE_UD1) > + return BUG_NONE; > + > + v = *(u8 *)(addr++); > + if (X86_MODRM_RM(v) == 4) > + addr++; > + > + *imm = 0; > + if (X86_MODRM_MOD(v) == 1) > + *imm = *(u8 *)addr; > + else if (X86_MODRM_MOD(v) == 2) > + *imm = *(u32 *)addr; > + else > + WARN_ONCE(1, "Unexpected MODRM_MOD: %u\n", X86_MODRM_MOD(v)); > + > + return BUG_UD1; > +} > + > + > static nokprobe_inline int > do_trap_no_signal(struct task_struct *tsk, int trapnr, const char *str, > struct pt_regs *regs, long error_code) > @@ -216,6 +256,8 @@ static inline void handle_invalid_op(struct pt_regs *regs) > static noinstr bool handle_bug(struct pt_regs *regs) > { > bool handled = false; > + int ud_type; > + u32 imm; > > /* > * Normally @regs are unpoisoned by irqentry_enter(), but handle_bug() > @@ -223,7 +265,8 @@ static noinstr bool handle_bug(struct pt_regs *regs) > * irqentry_enter(). > */ > kmsan_unpoison_entry_regs(regs); > - if (!is_valid_bugaddr(regs->ip)) > + ud_type = decode_bug(regs->ip, &imm); > + if (ud_type == BUG_NONE) > return handled; > > /* > @@ -236,10 +279,14 @@ static noinstr bool handle_bug(struct pt_regs *regs) > */ > if (regs->flags & X86_EFLAGS_IF) > raw_local_irq_enable(); > - if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN || > - handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) { > - regs->ip += LEN_UD2; > - handled = true; > + if (ud_type == BUG_UD2) { > + if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN || > + handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) { > + regs->ip += LEN_UD2; > + handled = true; > + } > + } else { > + handle_ubsan_failure(regs, imm); > } > if (regs->flags & X86_EFLAGS_IF) > raw_local_irq_disable(); > diff --git a/arch/x86/kernel/ubsan.c b/arch/x86/kernel/ubsan.c > new file mode 100644 > index 000000000000..c90e337a1b6a > --- /dev/null > +++ b/arch/x86/kernel/ubsan.c > @@ -0,0 +1,21 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Clang Undefined Behavior Sanitizer trap mode support. > + */ > +#include > +#include > +#include > +#include > +#include > +#include > + > +/* > + * Checks for the information embedded in the UD1 trap instruction > + * for the UB Sanitizer in order to pass along debugging output. > + */ > +void handle_ubsan_failure(struct pt_regs *regs, u16 type) > +{ > + if ((type & 0xFF) == UBSAN_REG) > + type >>= 8; This makes no sense, we've consumed the ModRM byte ealier, this should really only ever get the immediate. > + pr_crit("%s at %pS\n", report_ubsan_failure(regs, type), (void *)regs->ip); > +} > -- > 2.25.1 >