Date: Thu, 19 Mar 2026 11:21:31 +0000
From: David Laight
To: Uros Bizjak
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
 "Peter Zijlstra (Intel)", Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Subject: Re: [PATCH] x86/asm: Switch clflush alternatives to use %a address operand modifier
Message-ID: <20260319112131.0ba42dc7@pumpkin>
References: <20260318090831.501191-1-ubizjak@gmail.com> <20260318150315.6cff1844@pumpkin> <20260319102025.213b68aa@pumpkin>

On Thu, 19 Mar 2026 11:45:59 +0100
Uros Bizjak wrote:

> On Thu, Mar 19, 2026 at 11:20 AM David Laight
> wrote:
> >
> > On Wed, 18 Mar 2026 16:45:28 +0100
> > Uros Bizjak wrote:
> >
> > > On Wed, Mar 18, 2026 at 4:03 PM David Laight
> > > wrote:
> > > >
> > > > On Wed, 18 Mar 2026 10:08:11 +0100
> > > > Uros Bizjak wrote:
> > > >
> > > > > The inline asm used with alternative_input() specifies the address
> > > > > operand for clflush with the "a" input operand constraint and
> > > > > explicit "(%[addr])" dereference:
> > > > >
> > > > >   "clflush (%[addr])", [addr] "a" (addr)
> > > > >
> > > > > This forces the pointer into %rax and manually encodes the memory
> > > > > operand in the
> > > > > template. Instead, use the %a address operand
> > > > > modifier and relax the constraint from "a" to "r":
> > > > >
> > > > >   "clflush %a[addr]", [addr] "r" (addr)
> > > > >
> > > > > This lets the compiler choose the register while generating the
> > > > > correct addressing mode.
> > > >
> > > > Aren't these two independent changes?
> > >
> > > I was hoping I can put a trivial "a" -> "r" change under the "also
> > > ..." change. OTOH, let's change the summary to "x86/asm: Improve
> > > clflush alternatives assembly", that will also handle your proposed
> > > addition of "memory" clobber.
> > >
> > > > %a saves you having to know how to write the memory reference for the
> > > > architecture - so is the same as (%[addr]) (assuming AT&T syntax).
> > > > I think the assembler handles the one 'odd' case of (%rbp).
> > >
> > > Yes, it does, and also fixes another 'odd' case of (%r13).
> > >
> > > > Was there ever a reason for using "a" rather than "r" - it seems an
> > > > unusual choice.
> > >
> > > Probably just an oversight due to a follow-up __monitor() that wants
> > > its operand in %rax.
> >
> > Actually gcc can be quite bad at reverse tracking register requirements.
>
> This must be a very old GCC as I'm not aware of this deficiency.
>
> --cut here--
> void foo (int a)
> {
>   asm volatile ("# 1" : : "r" (a));
>   asm volatile ("# 2" : : "a" (a));
> }
>
> void bar (int a)
> {
>   asm volatile ("# 1" : : "a" (a));
>   asm volatile ("# 2" : : "a" (a));
> }
> --cut here--
>
> foo:
>         movl    %edi, %eax
>         # 1
>         # 2
>         ret
>
> bar:
>         movl    %edi, %eax
>         # 1
>         # 2
>         ret
>
> Do you perhaps have a testcase to illustrate your claim?

If you look at enough gcc output you'll see places where there are
register moves that look like they could be removed by adjusting
the register assignments.
I'm pretty sure Linus has commented about that as well.
Whether it can happen in this trivial case is another matter.

Oh - I can't see anything in the gcc 15.2 doc that says that the order
of 'asm volatile' statements can't get swapped.
I'm also pretty sure that some older (possibly very much older) versions
definitely would swap them over.
There might have been a post from someone saying that 'it doesn't do that
any more', but it isn't documented.

	David

>
> > So forcing 'addr' into %rax for the clflush might actually remove
> > a register move before the monitor.
> > Indeed, were it to pick a different register there will always be an
> > extra register move.
> > If the value is in a different register (e.g. from a function call)
> > then you'll move the register move instruction - but there'll still
> > be one.
> >
> > So I suspect this change can never improve the code.
>
> Of course, there will always be a register move in the above case, but
> please look at [1].
>
> [1] https://claude.ai/share/cf559f66-dfcf-451a-8260-6f687aead052
>
> Uros.
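
For anyone reading the thread without the patch to hand, the two operand
forms being compared can be sketched as a user-space C fragment. This is an
illustration only, not the kernel code: the helper names
clflush_fixed_reg(), clflush_any_reg() and flush_roundtrip_ok() are made up
for this sketch, the "memory" clobber is this sketch's choice (the thread
only mentions it as a proposed addition), and the non-x86 stubs exist purely
so the fragment builds on other targets.

```c
#include <string.h>

#if defined(__x86_64__)
/*
 * Original form: "a" pins the pointer to %rax and the template
 * spells out the AT&T dereference by hand.
 * (__asm__ is used instead of asm so this also builds with -std=c11.)
 */
static inline void clflush_fixed_reg(const volatile void *addr)
{
	__asm__ __volatile__("clflush (%[addr])"
			     : : [addr] "a" (addr) : "memory");
}

/*
 * Proposed form: "r" allows any general-purpose register and the
 * %a modifier prints the operand as a memory reference.
 */
static inline void clflush_any_reg(const volatile void *addr)
{
	__asm__ __volatile__("clflush %a[addr]"
			     : : [addr] "r" (addr) : "memory");
}
#else
/* Hypothetical no-op stubs so the sketch still compiles elsewhere. */
static inline void clflush_fixed_reg(const volatile void *addr) { (void)addr; }
static inline void clflush_any_reg(const volatile void *addr) { (void)addr; }
#endif

/*
 * Flushing a cache line must not change the data stored in it.
 * Returns 1 if the buffer still holds the pattern after both flushes.
 */
static int flush_roundtrip_ok(void)
{
	static char buf[64];
	char expect[64];

	memset(buf, 0x5a, sizeof(buf));
	memset(expect, 0x5a, sizeof(expect));
	clflush_fixed_reg(buf);
	clflush_any_reg(buf);
	return memcmp(buf, expect, sizeof(buf)) == 0;
}
```

Compiling this with -O2 -S and comparing the two helpers at their call
sites is one way to see which register each form ends up using; what the
compiler picks will depend on the surrounding code.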