Received: from wendy (10.10.85.11) by chn-vm-ex04.mchp-main.com
 (10.10.85.152) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.21 via Frontend
 Transport; Wed, 30 Aug 2023 00:00:01 -0700
Date: Wed, 30 Aug 2023 07:59:19 +0100
From: Conor Dooley
To: "Wang, Xiao W"
CC: Anup Patel, "paul.walmsley@sifive.com", "palmer@dabbelt.com",
 "aou@eecs.berkeley.edu", "ardb@kernel.org", "Li, Haicheng",
 "linux-riscv@lists.infradead.org", "linux-efi@vger.kernel.org",
 "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] RISC-V: Optimize bitops with Zbb extension
Message-ID: <20230830-breeze-washboard-ef496d5c9d5a@wendy>
References: <20230806024715.3061589-1-xiao.w.wang@intel.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="BxCw0DgMJ3ArtR6u"
Content-Disposition: inline
In-Reply-To:
Precedence: bulk
List-ID:
X-Mailing-List: linux-efi@vger.kernel.org

--BxCw0DgMJ3ArtR6u
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Aug 30, 2023 at 06:14:12AM +0000, Wang, Xiao W wrote:
> Hi,
>
> > -----Original Message-----
> > From: Anup Patel
> > Sent: Tuesday, August 29, 2023 7:08 PM
> > To: Wang, Xiao W
> > Cc: paul.walmsley@sifive.com; palmer@dabbelt.com;
> > aou@eecs.berkeley.edu; ardb@kernel.org; Li, Haicheng;
> > linux-riscv@lists.infradead.org; linux-efi@vger.kernel.org;
> > linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH] RISC-V: Optimize bitops with Zbb extension
> >
> > On Sun, Aug 6, 2023 at 8:09 AM Xiao Wang wrote:
> > >
> > > This patch leverages the alternative mechanism to dynamically optimize
> > > bitops (including __ffs, __fls, ffs, fls) with Zbb instructions. When
> > > Zbb ext is not supported by the runtime CPU, legacy implementation is
> > > used. If Zbb is supported, then the optimized variants will be selected
> > > via alternative patching.
> > >
> > > The legacy bitops support is taken from the generic C implementation as
> > > fallback.
> > >
> > > If the parameter is a build-time constant, we leverage compiler builtin to
> > > calculate the result directly, this approach is inspired by x86 bitops
> > > implementation.
> > >
> > > EFI stub runs before the kernel, so alternative mechanism should not be
> > > used there, this patch introduces a macro EFI_NO_ALTERNATIVE for this
> > > purpose.
> >
> > I am getting the following compile error with this patch:
> >
> >   GEN     Makefile
> >   UPD     include/config/kernel.release
> >   UPD     include/generated/utsrelease.h
> >   CC      kernel/bounds.s
> > In file included from /home/anup/Work/riscv-test/linux/include/linux/bitmap.h:9,
> >                  from /home/anup/Work/riscv-test/linux/arch/riscv/include/asm/cpufeature.h:9,
> >                  from /home/anup/Work/riscv-test/linux/arch/riscv/include/asm/hwcap.h:90,
>
>
> It looks there's a cyclic header including, which leads to this build error.
> I checked https://github.com/kvm-riscv/linux/tree/master and
> https://github.com/torvalds/linux/tree/master, but I don't see
> "asm/cpufeature.h" is included in asm/hwcap.h:90, maybe I miss something,
> could you help point me to the repo/branch I should work on?

From MAINTAINERS:

RISC-V ARCHITECTURE
...
T:	git git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git

The for-next branch there is what you should be basing work on top of.

AFAICT, you've made bitops.h include hwcap.h while cpufeature.h includes
both bitops.h (indirectly) and hwcap.h.

Hope that helps,
Conor.

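
As a rough illustration of the include-ordering problem described above,
the C sketch below models each header as a short list of steps (include
another header, declare __ffs, or use __ffs) and walks them the way the
preprocessor would, with include guards. The header contents are a
simplified model reconstructed from the compiler output and the
explanation above, not the real files. The re-include of linux/bitops.h
from linux/find.h is an assumption, and preprocess() is a hypothetical
helper written only for this example.

```c
#include <stdio.h>
#include <string.h>

enum kind { INCLUDE, DECLARE, USE, END };

struct item {
	enum kind kind;
	const char *name;
};

struct header {
	const char *name;
	struct item body[4];
	int seen;	/* models the header's include guard */
};

/* Simplified model of the chain in the error log (not the real contents). */
static struct header headers[] = {
	{ "linux/bitops.h",   { { INCLUDE, "asm/bitops.h" }, { END, NULL } }, 0 },
	/* With the patch, hwcap.h is pulled in before __ffs is defined. */
	{ "asm/bitops.h",     { { INCLUDE, "asm/hwcap.h" }, { DECLARE, "__ffs" }, { END, NULL } }, 0 },
	{ "asm/hwcap.h",      { { INCLUDE, "asm/cpufeature.h" }, { END, NULL } }, 0 },
	{ "asm/cpufeature.h", { { INCLUDE, "linux/bitmap.h" }, { END, NULL } }, 0 },
	{ "linux/bitmap.h",   { { INCLUDE, "linux/find.h" }, { END, NULL } }, 0 },
	/* Assumption: find.h tries to include linux/bitops.h before using __ffs. */
	{ "linux/find.h",     { { INCLUDE, "linux/bitops.h" }, { USE, "__ffs" }, { END, NULL } }, 0 },
};

static int ffs_declared;	/* set once the model reaches the __ffs definition */

static struct header *find_header(const char *name)
{
	for (size_t i = 0; i < sizeof(headers) / sizeof(headers[0]); i++)
		if (strcmp(headers[i].name, name) == 0)
			return &headers[i];
	return NULL;
}

/* Returns how many times __ffs is used before it has been declared. */
static int preprocess(const char *name)
{
	struct header *h = find_header(name);
	int errors = 0;

	if (!h || h->seen)
		return 0;	/* guard already set: the re-include expands to nothing */
	h->seen = 1;

	for (const struct item *it = h->body; it->kind != END; it++) {
		switch (it->kind) {
		case INCLUDE:
			errors += preprocess(it->name);
			break;
		case DECLARE:
			ffs_declared = 1;
			break;
		case USE:
			if (!ffs_declared) {
				printf("%s: implicit declaration of '%s'\n",
				       h->name, it->name);
				errors++;
			}
			break;
		default:
			break;
		}
	}
	return errors;
}
```

Calling preprocess("linux/bitops.h") on this model reports a single
use-before-declaration in linux/find.h: the guard for linux/bitops.h is
already set when find.h is reached, so the re-include does nothing and
__ffs is not yet visible.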
> >                  from /home/anup/Work/riscv-test/linux/arch/riscv/include/asm/bitops.h:26,
> >                  from /home/anup/Work/riscv-test/linux/include/linux/bitops.h:68,
> >                  from /home/anup/Work/riscv-test/linux/include/linux/log2.h:12,
> >                  from /home/anup/Work/riscv-test/linux/kernel/bounds.c:13:
> > /home/anup/Work/riscv-test/linux/include/linux/find.h: In function 'find_next_bit':
> > /home/anup/Work/riscv-test/linux/include/linux/find.h:64:30: error: implicit declaration of function '__ffs' [-Werror=implicit-function-declaration]
> >    64 |                 return val ? __ffs(val) : size;
> >
> > Regards,
> > Anup
> >
> >
> > >
> > > Signed-off-by: Xiao Wang
> > > ---
> > >  arch/riscv/include/asm/bitops.h       | 266 +++++++++++++++++++++++++-
> > >  drivers/firmware/efi/libstub/Makefile |   2 +-
> > >  2 files changed, 264 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h
> > > index 3540b690944b..f727f6489cd5 100644
> > > --- a/arch/riscv/include/asm/bitops.h
> > > +++ b/arch/riscv/include/asm/bitops.h
> > > @@ -15,13 +15,273 @@
> > >  #include
> > >  #include
> > >
> > > +#if !defined(CONFIG_RISCV_ISA_ZBB) || defined(EFI_NO_ALTERNATIVE)
> > >  #include
> > > -#include
> > > -#include
> > >  #include
> > > +#include
> > > +#include
> > > +
> > > +#else
> > > +#include
> > > +#include
> > > +
> > > +#if (BITS_PER_LONG == 64)
> > > +#define CTZW "ctzw "
> > > +#define CLZW "clzw "
> > > +#elif (BITS_PER_LONG == 32)
> > > +#define CTZW "ctz "
> > > +#define CLZW "clz "
> > > +#else
> > > +#error "Unexpected BITS_PER_LONG"
> > > +#endif
> > > +
> > > +static __always_inline unsigned long variable__ffs(unsigned long word)
> > > +{
> > > +	int num;
> > > +
> > > +	asm_volatile_goto(
> > > +		ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBB, 1)
> > > +		: : : : legacy);
> > > +
> > > +	asm volatile (
> > > +		".option push\n"
> > > +		".option arch,+zbb\n"
> > > +		"ctz %0, %1\n"
> > > +		".option pop\n"
> > > +		: "=r" (word) : "r" (word) :);
> > > +
> > > +	return word;
> > > +
> > > +legacy:
> > > +	num = 0;
> > > +#if BITS_PER_LONG == 64
> > > +	if ((word & 0xffffffff) == 0) {
> > > +		num += 32;
> > > +		word >>= 32;
> > > +	}
> > > +#endif
> > > +	if ((word & 0xffff) == 0) {
> > > +		num += 16;
> > > +		word >>= 16;
> > > +	}
> > > +	if ((word & 0xff) == 0) {
> > > +		num += 8;
> > > +		word >>= 8;
> > > +	}
> > > +	if ((word & 0xf) == 0) {
> > > +		num += 4;
> > > +		word >>= 4;
> > > +	}
> > > +	if ((word & 0x3) == 0) {
> > > +		num += 2;
> > > +		word >>= 2;
> > > +	}
> > > +	if ((word & 0x1) == 0)
> > > +		num += 1;
> > > +	return num;
> > > +}
> > > +
> > > +/**
> > > + * __ffs - find first set bit in a long word
> > > + * @word: The word to search
> > > + *
> > > + * Undefined if no set bit exists, so code should check against 0 first.
> > > + */
> > > +#define __ffs(word)				\
> > > +	(__builtin_constant_p(word) ?		\
> > > +	 (unsigned long)__builtin_ctzl(word) :	\
> > > +	 variable__ffs(word))
> > > +
> [...]

--BxCw0DgMJ3ArtR6u
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZO7oxwAKCRB4tDGHoIJi
0ozlAQDIl5mI7ovdiMJADRIobwVL0Wabj7o88Y28I+1sghdKfAEAx17o0kYzHC1p
TWhYbjU4Mgdz6pbu5yyaDwqyyD44qw0=
=MI7l
-----END PGP SIGNATURE-----

--BxCw0DgMJ3ArtR6u--
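
For readers who want to try the constant/variable split from the quoted
patch outside the kernel, here is a minimal user-space sketch. It
assumes GCC or Clang (for __builtin_constant_p and __builtin_ctzl). The
names legacy_ffs() and demo_ffs() are made up for this example, and the
runtime-patched Zbb path (the ALTERNATIVE jump and the ctz instruction)
is deliberately left out, so the variable case always takes the
binary-search fallback.

```c
#include <limits.h>

/*
 * User-space copy of the binary-search fallback from the quoted patch.
 * As with __ffs, the result is undefined for word == 0.
 */
static inline unsigned long legacy_ffs(unsigned long word)
{
	unsigned long num = 0;

#if ULONG_MAX > 0xffffffffUL	/* stands in for BITS_PER_LONG == 64 */
	if ((word & 0xffffffffUL) == 0) {
		num += 32;
		word >>= 32;
	}
#endif
	if ((word & 0xffff) == 0) {
		num += 16;
		word >>= 16;
	}
	if ((word & 0xff) == 0) {
		num += 8;
		word >>= 8;
	}
	if ((word & 0xf) == 0) {
		num += 4;
		word >>= 4;
	}
	if ((word & 0x3) == 0) {
		num += 2;
		word >>= 2;
	}
	if ((word & 0x1) == 0)
		num += 1;
	return num;
}

/*
 * Compile-time constants are folded by the compiler builtin; everything
 * else goes to the fallback (where the kernel patch would try Zbb first).
 */
#define demo_ffs(word)					\
	(__builtin_constant_p(word) ?			\
	 (unsigned long)__builtin_ctzl(word) :		\
	 legacy_ffs(word))
```

With this sketch, demo_ffs(0x80UL) can be folded to 7 at build time,
while demo_ffs(v) for a runtime value v falls through to legacy_ffs().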