Date: Fri, 21 Oct 2022 16:56:20 +0100
From: Mark Rutland
To: llvm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Fangrui Song, Joao Moreira, Josh Poimboeuf, Kees Cook,
	Nathan Chancellor, Nick Desaulniers, Peter Zijlstra, Sami Tolvanen
Subject: kCFI && patchable-function-entry=M,N

Hi,

For arm64, I'd like to use -fpatchable-function-entry=M,N (where N > 0) for
our ftrace implementation, which instruments *some* but not all functions.

Unfortunately, this doesn't play nicely with -fsanitize=kcfi, as instrumented
and non-instrumented functions don't agree on where the type hash should live
relative to the function entry point, making them incompatible with one
another. AFAICT, there's no mechanism today to get them to agree.

Today we use -fpatchable-function-entry=2 (i.e. N=0), which happens to avoid
this.

For example, in the below, functions with N=0 expect the type hash to be placed
4 bytes before the entry point, and functions with N=2 expect the type hash to
be placed 12 bytes before the entry point.

| % cat test.c
| #define __notrace __attribute__((patchable_function_entry(0, 0)))
|
| void callee_patchable(void)
| {
| }
|
| void __notrace callee_non_patchable(void)
| {
| }
|
| typedef void (*callee_fn_t)(void);
|
| void caller_patchable(callee_fn_t callee)
| {
| 	callee();
| }
|
| void __notrace caller_non_patchable(callee_fn_t callee)
| {
| 	callee();
| }
| % clang --target=aarch64-linux -c test.c -fpatchable-function-entry=2,2 -fsanitize=kcfi -O2
| % aarch64-linux-objdump -d test.o
|
| test.o:     file format elf64-littleaarch64
|
|
| Disassembly of section .text:
|
| 0000000000000000 :
|    0:   a540670c        .word   0xa540670c
|    4:   d503201f        nop
|    8:   d503201f        nop
|
| 000000000000000c <callee_patchable>:
|    c:   d65f03c0        ret
|   10:   a540670c        .word   0xa540670c
|
| 0000000000000014 <callee_non_patchable>:
|   14:   d65f03c0        ret
|   18:   07d85f31        .word   0x07d85f31
|   1c:   d503201f        nop
|   20:   d503201f        nop
|
| 0000000000000024 <caller_patchable>:
|   24:   b85f4010        ldur    w16, [x0, #-12]
|   28:   728ce191        movk    w17, #0x670c
|   2c:   72b4a811        movk    w17, #0xa540, lsl #16
|   30:   6b11021f        cmp     w16, w17
|   34:   54000040        b.eq    3c  // b.none
|   38:   d4304400        brk     #0x8220
|   3c:   d61f0000        br      x0
|   40:   07d85f31        .word   0x07d85f31
|
| 0000000000000044 <caller_non_patchable>:
|   44:   b85fc010        ldur    w16, [x0, #-4]
|   48:   728ce191        movk    w17, #0x670c
|   4c:   72b4a811        movk    w17, #0xa540, lsl #16
|   50:   6b11021f        cmp     w16, w17
|   54:   54000040        b.eq    5c  // b.none
|   58:   d4304400        brk     #0x8220
|   5c:   d61f0000        br      x0

On arm64, I'd like to use -fpatchable-function-entry=4,2 along with
-falign-functions=8, so that we can place a naturally-aligned 8-byte literal
before the function (e.g. a pointer value). That allows us to implement an
efficient per-callsite hook without hitting branch range limitations and/or
requiring trampolines.

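The offsets in the listing above follow from simple arithmetic, which the C sketch below spells out. It is illustrative only and not taken from clang or the kernel: the helper names hash_offset_before_entry() and pre_nops_hold_aligned_u64() are hypothetical, and it assumes 4-byte instructions, a 4-byte type hash emitted before the pre-function NOPs, and an 8-byte-aligned entry point.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: AArch64 instructions and the kCFI type hash are 4 bytes each. */
#define INSN_SIZE 4u
#define HASH_SIZE 4u

/*
 * Hypothetical helper: distance in bytes from the function entry point back
 * to the type hash, when the hash is emitted before the N pre-function NOPs
 * (the layout shown in the disassembly).
 */
unsigned int hash_offset_before_entry(unsigned int n_pre_nops)
{
	return HASH_SIZE + n_pre_nops * INSN_SIZE;
}

/*
 * Hypothetical helper: returns 1 if the pre-function NOPs cover exactly
 * 8 bytes and that slot is naturally aligned, so it could hold an 8-byte
 * literal such as a pointer; returns 0 otherwise.
 */
int pre_nops_hold_aligned_u64(uintptr_t entry, unsigned int n_pre_nops)
{
	uintptr_t pad = (uintptr_t)n_pre_nops * INSN_SIZE;
	uintptr_t slot;

	/* Guard against wrapping below address zero. */
	assert(entry >= pad);
	slot = entry - pad;

	return pad == 8 && slot % 8 == 0;
}
```

Under these assumptions, hash_offset_before_entry(0) is 4 and hash_offset_before_entry(2) is 12, matching the #-4 and #-12 loads in the two callers. With N=2 and an 8-byte-aligned entry point, the two pre-function NOPs occupy a naturally-aligned 8-byte slot.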
I have a PoC that works without kCFI at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace/per-callsite-ops

I mentioned this in the #clangbuiltlinux IRC channel, and Sami wrote up a
GitHub issue at:

  https://github.com/ClangBuiltLinux/linux/issues/1744

Currently clang generates:

| 	< HASH >
| 	NOP
| 	NOP
| func:	BTI	// optional
| 	NOP
| 	NOP

... and to make this consistent with non-instrumented functions, the
non-instrumented functions would need pre-function padding before their hashes.

For my use-case, it doesn't matter where the pre-function NOPs are placed
relative to the type hash, so long as the location is consistent. It might be
nicer to have the option to place the pre-function NOPs before the hash, which
would avoid needing to change non-instrumented functions (and save some space),
e.g.

| 	NOP
| 	NOP
| 	< HASH >
| func:	BTI	// optional
| 	NOP
| 	NOP

... but I understand that for x86, folk want the pre-function NOPs to
fall-through into the body of the function.

Is there any mechanism today that we could use to solve this, or could we
extend clang to have some options to control this behaviour?

It would also be helpful to have a symbol before both the hash and pre-function
NOPs so that we can filter those out of probes patching (I see that x86 does
this with the __cfi_function symbol).

Thanks,
Mark.