From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0280DC433EF for ; Thu, 2 Dec 2021 10:17:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=a+e1cMb3GGBK5ffPwpYN6TL6OpElno36XPpqawh+19U=; b=nasfs81eXj1Cxy jJ+JtFF1Gp4hbdEN3f79JyP7X2+wW9phrkG4MuBIA/F3xuByZXgwXyCIFf/moqJ46Ih0+CaTLgh6I qLNsQTFxbFx4LDIy8Gp6rkquR6OC4NyqxzImb5amE7LGaPSkct7yyylwNIjhu/Ll3au/cR1jVwNTM n9Oki2iDpth8jzuQvDDu/KN5CyHcf95j8usjva+AySbthNDGSvXnakbReFie7MRFvOM0XuAGT/jq+ fYnT96O0R/pL92a+MYlgHaEEx3NejVsTxZ7j+FkVmFuqLPQohPfYUv5GRk48N8BjdxuqSRWV8Jxxo XbQBGGiPggHAyoDHPzvg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1msj8P-00BnO9-6u; Thu, 02 Dec 2021 10:16:13 +0000 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1msj8K-00BnNM-ML for linux-arm-kernel@lists.infradead.org; Thu, 02 Dec 2021 10:16:10 +0000 Received: by mail-wm1-x329.google.com with SMTP id 137so22570737wma.1 for ; Thu, 02 Dec 2021 02:16:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=dvo4KVBG0Lx2iX6zR/VCV2kL5rEaU3wC7t8V2k80olI=; b=nJiPFMW95wKn+lLyNIbdyV0kzUALgFXwHNktAjmi5lyAE4mWJaIvMCBE2GEaqNIW1t QcNaptb0ibXK/CnaBjgx7/Ix9I93ouNXn/eYEOnWwzLrldYiI2NfHaNjAQTtwwBQ1oTs 43Br/8/GjYexyD/bD5G3lb7o9GL8NZ6s2ongOX3pPgmxzKosorTHlhuBXsWCBQ/29Jq3 99um6aUQser1CjL5t+mhY3yn5Ohz+mf7ILcxnYL/12bbVBJ2mcj4IEsngEeseSlSDirr IeqHU4gPQbZb0bv3vYDx9PYqTDAsKqI5GyMd8xlVLHIFNSXePUoP/gOTlVtkLpjpGWto XqQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=dvo4KVBG0Lx2iX6zR/VCV2kL5rEaU3wC7t8V2k80olI=; b=uczjN6MK9SjZ9VDHBWHNkBaTIc7WW6SVCcK1Vfgz1TTgcAxmH3/IpIaC4sbS17xLLr C/zHHmjZuexayVf3ii1b/r038nzLZAKoLDjVYYJdtuVmbfsDblyOdKvPBZ0Yi05zx7A9 usc6lFUSKSx74VWwrZ2miJfLspS37J/OAcXIgSg2ScmQEpbWOH6shColucPRof5T7ZaP lhHt+U/Mo1X1aPYUTUe3IP6iTK63I/1sIjBhr5BW6sdQU+QoXz6/6I/A2EnSDlJCJ1En p4nH6gC0TcKSNQdI2eeayDLsDo+Thvk+H1ZXygcGDQOP5GcKh+teTZkTq75/5ISllHed qaEQ== X-Gm-Message-State: AOAM531zMrU0cb6fzc7gCkswG2jCFfsJXbdxD5xIYY3i74UNca5zLMCP Zt1pgIf6O9FuFXdW+4fOxuq5/A== X-Google-Smtp-Source: ABdhPJz4n47DOmobBdHwetvrGEaEhLW0CXYatFsUwhbNPwmG2nRlsMqF8YKokqeHziiPkR4XjFTv/Q== X-Received: by 2002:a1c:770e:: with SMTP id t14mr5077846wmi.173.1638440166839; Thu, 02 Dec 2021 02:16:06 -0800 (PST) Received: from elver.google.com ([2a00:79e0:15:13:ddd6:f3c9:b2f0:82f3]) by smtp.gmail.com with ESMTPSA id m20sm1850035wmq.11.2021.12.02.02.16.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 Dec 2021 02:16:06 -0800 (PST) Date: Thu, 2 Dec 2021 11:15:59 +0100 From: Marco Elver To: Kefeng Wang Cc: Mark Rutland , Catalin Marinas , Will Deacon , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, joey.gouly@arm.com Subject: Re: [PATCH v2] arm64: Enable KCSAN Message-ID: References: <20211129145732.27000-1-wangkefeng.wang@huawei.com> <811af0bc-0c99-37f6-a39a-095418b10661@huawei.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <811af0bc-0c99-37f6-a39a-095418b10661@huawei.com> User-Agent: Mutt/2.0.5 (2021-01-21) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211202_021608_771697_483B1889 X-CRM114-Status: GOOD ( 49.06 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Thu, Dec 02, 2021 at 09:35AM +0800, Kefeng Wang wrote: > = > On 2021/12/1 19:53, Mark Rutland wrote: > > Hi Kefeng, > > = > > On Mon, Nov 29, 2021 at 10:57:32PM +0800, Kefeng Wang wrote: > > > This patch enables KCSAN for arm64, with updates to build rules > > > to not use KCSAN for several incompatible compilation units. > > > = > > > Resent GCC version(at least GCC10) made outline-atomics as the > > > default option(unlike Clang), which will cause linker errors > > > for kernel/kcsan/core.o. > > > = > > > Disables the out-of-line atomics by no-outline-atomics to fix > > > the linker errors. > > > = > > > Tested selftest and kcsan_test(built with GCC11 and Clang 13), > > > and all passed. > > Nice! > > = > > I think there are a few additional bits and pieces we'll need: > > = > > * Prior to clang 12.0.0, KCSAN would produce warnings with BTI, as I fo= und in: > > = > > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commi= t/?h=3Darm64/kcsan&id=3D2d67c39ae4f619ca94d9790e09186e77922fa826 > > = > > Since BTI is in defconfig, I think arm64's Kconfig should require a = minimum > > of clang 12.0.0 to enable KCSAN. > = > I don't have different clang version to test,=A0 when check KCSAN, > = > commit eb32f9f990d9 ("kcsan: Improve some Kconfig comments") saids, > = > = > =A0=A0=A0 The compiler instruments plain compound read-write operations > =A0=A0=A0 differently (++, --, +=3D, -=3D, |=3D, &=3D, etc.), which allow= s KCSAN to > =A0=A0=A0 distinguish them from other plain accesses. This is currently > =A0=A0=A0 supported by Clang 12 or later. > = > Should we add a=A0 "depends on CLANG_VERSION >=3D 120000" KCSAN works just fine with Clang 11. Clang 12 merely improves some instrumentation, which is what this comment is about. What Mark meant is that there's a specific issue with arm64 and BTI that is fixed by Clang 12. Therefore, arm64's Kconfig will have to do select HAVE_ARCH_KCSAN if CC_IS_GCC || CLANG_VERSION >=3D 120000 > > = > > * In the past clang did not have an attribute to suppress tsan instrume= nation > > and would instrument noinstr regions. I'm not sure when clang gained= the > > relevant attribute to supress this, but we will need to depend on th= is > > existing, either based on the clang version or with a test for the a= ttribute. > > = > > (If we're lucky, clang 12.0.0 is sufficient, and we solve BTI and th= is in one > > go). > > = > > I *think* GCC always had an attribute, but I'm not certain. > > = > > Marco, is there an existing dependency somewhere for this to work on= x86? I > > thought there was an objtool pass to NOP this out, but I couldn't fi= nd it in > > mainline. If x86 is implicitly depending on a sufficiently recent ve= rsion of > > clang, we add something to the common KCSAN Kconfig for ARCH_WANTS_N= O_INSTR? > > = > > * There are some latent issues with some code (e.g. alternatives, patch= ing, insn) > > code being instrumentable even though this is unsound, and depending= on > > compiler choices this can happen to be fine or can result in boot-ti= me > > failures (I saw lockups when I started trying to add KCSAN for arm64= ). > > = > > While this isn't just a KCSAN problem, fixing that requires some fai= rly > > significant rework to a bunch of code, and until that's done we're o= n very > > shaky ground. So I'd like to make KCSAN depend on EXPERT for now. > > = > > I had an initial stab at fixing some of that, e.g. > > = > > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?= h=3Darm64/patching/rework > > Joey has started looking into this too. > = > Thanks for your information,=A0 I don't know about this. As your say, we = could > add a depend on EXPERT > = > for now and more explanation into changlog. So what I gather arm64's final select line may look like: select HAVE_ARCH_KCSAN if EXPERT && (CC_IS_GCC || CLANG_VERSION >=3D 12000= 0) > > = > > * When I last tested, for simple boots I would get frequent KCSAN splat= s for a > > few common issues, and those drowned out all other reports. > > = > > One case was manipulation of thread_info::flags, which Thomas Gleixn= er has > > queued some fixes at: > > = > > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h= =3Dcore/entry > > There were some other common failures, e.g. accesses to task_struct:= :on_cpu, > > and I hadn't had the chance to investigate/fix those, beyond a (like= ly > > unsound) hack: > > = > > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commi= t/?h=3Darm64/kcsan&id=3D4fe9d6c2ef85257d80291086e4514eaaebd3504e > > = > > It would be good if we could identify the most frequent problems (e.= g. those > > that will occur when just booting) before we enable this generally, = to avoid > > everyone racing to report/fix those as soon as we enable the feature. > > = > > When you tested, did KCSAN flag anything beyond the selftests? > = > Yes, there are some KCSAN reports, but this is not only exist on arm64, I > saw=A0 owner->on_cpu warning > = > on x86 too, eg, we also hack to disable it via data_race. > = > Reported by Kernel Concurrency Sanitizer on: > CPU: 7 PID: 2530 Comm: syz-executor.11 Not tainted 5.10.0+ #113 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubunt= u1.104/01/2014 > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > BUG: KCSAN: data-race in rwsem_spin_on_owner+0xf4/0x180 > = > race at unknown origin, with read to 0xffff9767d3becfac of 4 bytes by tas= k 18119 on cpu 0: > rwsem_spin_on_owner+0xf4/0x180 > rwsem_optimistic_spin+0x48/0x480 > rwsem_down_read_slowpath+0x4a0/0x670 > down_read+0x69/0x190 > process_vm_rw+0x41e/0x840 > __x64_sys_process_vm_writev+0x76/0x90 > do_syscall_64+0x37/0x50 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 I think fixing data races is not a pre-requisite for arch-enablement. Some are slowly being addressed (and others aren't -- syzbot has a list of >200 data races that I try to moderate and fix some or forward those that I think will get fixed). I expect the most frequent issues will be the same on arm64 as they are on x86. I actually have a "fix" for the data race in rwsem_spin_on_owner, that also shows where the other racing access comes from... which reminds me: https://lkml.kernel.org/r/20211202101238.33546-1-elver@google.com Thanks, -- Marco _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel