Subject: Re: [PATCH] arm64/io: Remind compiler that there is a memory side effect
From: Doug Berger
To: Andrew Pinski, Mark Rutland
Cc: Jeremy Linton, GCC Mailing List, f.fainelli@gmail.com, maz@kernel.org,
    marcan@marcan.st, LKML, Catalin Marinas, will@kernel.org,
    linux-arm-kernel@lists.infradead.org
Date: Sun, 3 Apr 2022 10:40:11 -0700
Message-ID: <6bf0a0b6-99ac-1555-31d5-ce74c8497dfa@gmail.com>
References: <20220401164406.61583-1-jeremy.linton@arm.com>

On 4/3/2022 12:36 AM, Andrew Pinski wrote:
> On Fri, Apr 1, 2022 at 10:24 AM Mark Rutland via Gcc wrote:
>>
>> Hi Jeremy,
>>
>> Thanks for raising this.
>>
>> On Fri, Apr 01, 2022 at 11:44:06AM -0500, Jeremy Linton wrote:
>>> The relaxed variants of read/write macros are only declared
>>> as `asm volatile()` which forces the compiler to generate the
>>> instruction in the code path as intended. The only problem
>>> is that it doesn't also tell the compiler that there may
>>> be memory side effects.
>>> Meaning that if a function is comprised
>>> entirely of relaxed io operations, the compiler may think that
>>> it only has register side effects and doesn't need to be called.
>>
>> As I mentioned on a private mail, I don't think that reasoning above is
>> correct, and I think this is a miscompilation (i.e. a compiler bug).
>>
>> The important thing is that any `asm volatile` may have side effects
>> generally outside of memory or GPRs, and whether the assembly contains a
>> memory load/store is immaterial. We should not need to add a memory
>> clobber in order to retain the volatile semantic.
>>
>> See:
>>
>>   https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
>>
>> ... and consider the x86 example that reads rdtsc, or an arm64 sequence like:
>>
>> | void do_sysreg_thing(void)
>> | {
>> |         unsigned long tmp;
>> |
>> |         tmp = read_sysreg(some_reg);
>> |         tmp |= SOME_BIT;
>> |         write_sysreg(some_reg);
>> | }
>>
>> ... where there's no memory that we should need to hazard against.
>>
>> This patch might work around the issue, but I don't believe it is a
>> correct fix.

I agree with Mark that this patch is an attempt to work around a bug in
the GCC 12 compiler.

> It might not be the most restricted fix but it is a fix.
> The best fix is to tell the compiler that you are writing to that
> location of memory.
> volatile asm does not do what you think it does.
> You didn't read further down about memory clobbers:
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers
> Specifically this part:
> "The 'memory' clobber tells the compiler that the assembly code
> performs memory reads or writes to items other than those listed in
> the input and output operands."

I agree that volatile asm does not do what I think it should do in this
case, but it appears to me that it does not do what the documentation
states that it should do, and that is the bug in GCC 12.
My interpretation of the referenced documentation is that the volatile
qualifier of the asm keyword should prevent the GCC optimizer from
performing certain optimizations. Of specific relevance in this scenario
are the optimizers that "sometimes discard asm statements if they
determine there is no need for the output variables."

The clobbers tell the compiler about side effects or dependencies that
the asm block may have that could be relevant to code outside the asm
block, so that proper functionality can be preserved and the optimizer
can still do a good job.

The functions in this patch do access memory (well, technically
registers...) and therefore adding the "memory" clobber is not "wrong",
but the read variants of these functions also access memory, so adding
the "memory" clobber to them would be equally appropriate (or
inappropriate). This would not affect the functionality, but it is
"heavy-handed" and can have an unnecessary effect on performance. The
"memory" clobber indicates that memory is somehow affected by the asm
block and therefore requires the compiler to flush data in working
registers to memory before the block and reload values from memory after
the block.

A better solution is to communicate the side effects more precisely, to
avoid operations that can be determined to be unnecessary. In the case
of these functions, the only address accessed is in register space.
Accesses to register space can have all kinds of side effects, but these
side effects are communicated to the compiler by declaring the addr
formal parameter with the volatile and __iomem attributes. In this way
it is clear to the compiler that any writes to addr by code before the
start of the asm block must occur before entering the block, and any
accesses to addr by code after the block must occur after executing the
block, such that the use of the "memory" clobber is unnecessary.

>
>>
>>> For an example function look at bcmgenet_enable_dma(), before the
>>> relaxed variants were removed.
>>> When built with gcc12 the code
>>> contains the asm blocks as expected, but then the function is
>>> never called.
>>
>> So it sounds like this is a regression in GCC 12, which IIUC isn't
>> released yet per:

> It is NOT a bug in GCC 12. You just depended on behavior which
> accidentally worked in the cases you were looking at. GCC 12 did not
> even change in this area.

GCC 12 should not have changed in this area, but the evidence suggests
that in fact the behavior has changed such that an asm volatile block
can be discarded by an optimizer. This appears unintentional and is
therefore a bug that should be corrected before release of the
toolchain, since it could potentially affect any asm volatile block in
the Linux source.

Regards,
    Doug

>
> Thanks,
> Andrew Pinski
>
>>
>>   https://gcc.gnu.org/gcc-12/changes.html
>>
>> ... which says:
>>
>> | Note: GCC 12 has not been released yet
>>
>> Surely we can fix it prior to release?
>>
>> Thanks,
>> Mark.
>>
>>>
>>> Signed-off-by: Jeremy Linton
>>> ---
>>>  arch/arm64/include/asm/io.h | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>>> index 7fd836bea7eb..3cceda7948a0 100644
>>> --- a/arch/arm64/include/asm/io.h
>>> +++ b/arch/arm64/include/asm/io.h
>>> @@ -24,25 +24,25 @@
>>>  #define __raw_writeb __raw_writeb
>>>  static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>>>  {
>>> -	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>>> +	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>  }
>>>
>>>  #define __raw_writew __raw_writew
>>>  static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>>>  {
>>> -	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
>>> +	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>  }
>>>
>>>  #define __raw_writel __raw_writel
>>>  static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
>>>  {
>>> -	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
>>> +	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>  }
>>>
>>>  #define __raw_writeq __raw_writeq
>>>  static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
>>>  {
>>> -	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
>>> +	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr) : "memory");
>>>  }
>>>
>>>  #define __raw_readb __raw_readb
>>> --
>>> 2.35.1
>>>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel