From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B162AC433EF for ; Mon, 7 Mar 2022 07:33:35 +0000 (UTC) Received: from localhost ([::1]:48104 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nR7s6-0001pc-TR for qemu-devel@archiver.kernel.org; Mon, 07 Mar 2022 02:33:34 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36440) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nR7pa-0008CH-W8 for qemu-devel@nongnu.org; Mon, 07 Mar 2022 02:30:59 -0500 Received: from [2607:f8b0:4864:20::436] (port=37798 helo=mail-pf1-x436.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nR7pY-00051G-AC for qemu-devel@nongnu.org; Mon, 07 Mar 2022 02:30:58 -0500 Received: by mail-pf1-x436.google.com with SMTP id t5so12996216pfg.4 for ; Sun, 06 Mar 2022 23:30:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=kdeU+sjY6VX/2oE75mtjm4Y8EaSVUCRuD/mTH+0DUKE=; b=Gz/psIc9SqsuB36SvkJesXClyv/GZuiy3AoYH8SqbXNDtnEBdhSYSCDyrhk3K75PIn u5f17am4wVCeoMEKWd1ez/5WAK90iB4XG5+sVuYyNHsYZgBW6VMoaNpqHhDr6i5JkJTM /76bh+ScIzQnqjKMgOl7BwOXLUyilU0QJj+URHuhY0Gn9I5mFRn6t5w26SgdLLHERvkr L5ne+1+PjM2jTEuzrF1Zr4dLlWkYiybygAdCkXyv6ZuDTrCnPXWgJP2W0sD/Nb/l1yM+ kytJ8esuyu3wDpHMSom01vq048WmNHhwlFiejEGi1CB9M1ojZ1J0lKA08yXaD5aMk918 dHIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=kdeU+sjY6VX/2oE75mtjm4Y8EaSVUCRuD/mTH+0DUKE=; b=w7JuKzlyjbdnNqCOi70jJBDIMYqYYiH/wgtTDRmOUjelH86DuY0suKv4o9LCLBNbts OASpP6cM66UeveFCkfR/NolIs5CzqbkdHFQAacd/cIVF17A5JMU115xdYpZqA653Icg+ gQn5W51yaLbPpY7WlJpheWqz2Gl56150qfQa2NyM3BR716rNZpRij0J/8r5XCuFMettA DbaS8e07sLnsQjx9IenNCjW6vwoocDsOvpidR1Cya74r4fJRthIP4MPFZqflFb/xpsvB zcrnBavk/MfH6CMmOa0CHxsvtdYN1oopO8r5xNg9g9Dtt6sWnFa/gZHEiQ1gJc8ofOQw gsDw== X-Gm-Message-State: AOAM531np2d09cZlywG0hAz6V4CeozMIhvFWUKEVFei4UQ0sU92niHMz 0J028noLGDaqEI0lv9mTQOFBobuOZNVhiw== X-Google-Smtp-Source: ABdhPJwvm/raXQGHYt5rq5IF1yL8HbVP3kbtlK1FPpOzpIys7qQqLTtvT+AAx3TWjym+RgG1FtHmKg== X-Received: by 2002:a05:6a00:1950:b0:4f6:369b:4c4d with SMTP id s16-20020a056a00195000b004f6369b4c4dmr11385753pfk.79.1646638253280; Sun, 06 Mar 2022 23:30:53 -0800 (PST) Received: from mail-pl1-f171.google.com (mail-pl1-f171.google.com. [209.85.214.171]) by smtp.gmail.com with ESMTPSA id f6-20020a056a00228600b004f709f5f3c1sm1520730pfe.28.2022.03.06.23.30.52 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 06 Mar 2022 23:30:52 -0800 (PST) Received: by mail-pl1-f171.google.com with SMTP id s1so12984806plg.12; Sun, 06 Mar 2022 23:30:52 -0800 (PST) X-Received: by 2002:a17:902:d48b:b0:151:a5a0:e117 with SMTP id c11-20020a170902d48b00b00151a5a0e117mr10814166plg.119.1646638251769; Sun, 06 Mar 2022 23:30:51 -0800 (PST) MIME-Version: 1.0 References: <20220216154839.1024927-1-cmuellner@linux.com> <20220216154839.1024927-3-cmuellner@linux.com> <2c0fcafe-868a-ba7a-bf42-3a4ed2df19ba@iscas.ac.cn> In-Reply-To: From: Frank Chang Date: Mon, 7 Mar 2022 15:30:40 +0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v4 2/2] target/riscv: Enable Zicbo[m,z,p] instructions To: =?UTF-8?Q?Christoph_M=C3=BCllner?= Content-Type: multipart/alternative; boundary="0000000000001202fa05d99bd865" X-Host-Lookup-Failed: Reverse DNS lookup failed for 2607:f8b0:4864:20::436 (failed) Received-SPF: pass client-ip=2607:f8b0:4864:20::436; envelope-from=frank.chang@sifive.com; helo=mail-pf1-x436.google.com X-Spam_score_int: -6 X-Spam_score: -0.7 X-Spam_bar: / X-Spam_report: (-0.7 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, PDS_HP_HELO_NORDNS=0.659, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Weiwei Li , "open list:RISC-V" , Anup Patel , Bin Meng , Atish Patra , "qemu-devel@nongnu.org Developers" , Philipp Tomsich , Alistair Francis , Palmer Dabbelt , Richard Henderson , =?UTF-8?B?RnLDqWTDqXJpYyBQw6l0cm90?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" --0000000000001202fa05d99bd865 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Christoph M=C3=BCllner =E6=96=BC 2022=E5=B9=B42=E6=9C= =8817=E6=97=A5 =E9=80=B1=E5=9B=9B =E4=B8=8B=E5=8D=8812:00=E5=AF=AB=E9=81=93= =EF=BC=9A > > > On Thu, Feb 17, 2022 at 3:15 AM Weiwei Li wrote: > >> >> =E5=9C=A8 2022/2/16 =E4=B8=8B=E5=8D=8811:48, Christoph Muellner =E5=86= =99=E9=81=93: >> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c >> > index 39ffb883fc..04500fe352 100644 >> > --- a/target/riscv/cpu.c >> > +++ b/target/riscv/cpu.c >> > @@ -764,6 +764,10 @@ static Property riscv_cpu_properties[] =3D { >> > DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true), >> > DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), >> > DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true), >> > + DEFINE_PROP_BOOL("zicbom", RISCVCPU, cfg.ext_icbom, true), >> > + DEFINE_PROP_BOOL("zicboz", RISCVCPU, cfg.ext_icboz, true), >> > + DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize= , >> 64), >> > + DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize= , >> 64), >> Why use two different cache block size here? Is there any new spec >> update for this? >> > > No, we are talking about the same specification. > > Section 2.7 states the following: > """ > The initial set of CMO extensions requires the following information to b= e > discovered by software: > * The size of the cache block for management and prefetch instructions > * The size of the cache block for zero instructions > * CBIE support at each privilege level > """ > > So at least the spec authors did differentiate between the two block size= s > as well. > > >> > DEFINE_PROP_BOOL("Zfh", RISCVCPU, cfg.ext_zfh, false), >> > DEFINE_PROP_BOOL("Zfhmin", RISCVCPU, cfg.ext_zfhmin, false), >> > DEFINE_PROP_BOOL("Zve32f", RISCVCPU, cfg.ext_zve32f, false), >> > + >> > +/* helper_zicbom_access >> > + * >> > + * Check access permissions (LOAD, STORE or FETCH as specified in >> section >> > + * 2.5.2 of the CMO specification) for Zicbom, raising either store >> > + * page-fault (non-virtualised) or store guest-page fault >> (virtualised). >> > + */ >> > +static void helper_zicbom_access(CPURISCVState *env, target_ulong >> address, >> > + uintptr_t ra) >> > +{ >> > + int ret; >> > + void* phost; >> > + int mmu_idx =3D cpu_mmu_index(env, false); >> > + >> > + /* Get the size of the cache block for management instructions. *= / >> > + RISCVCPU *cpu =3D env_archcpu(env); >> > + uint16_t cbomlen =3D cpu->cfg.cbom_blocksize; >> > + >> > + /* Mask off low-bits to align-down to the cache-block. */ >> > + address &=3D ~(cbomlen - 1); >> > + >> > + /* A cache-block management instruction is permitted to access >> > + * the specified cache block whenever a load instruction, store >> > + * instruction, or instruction fetch is permitted to access the >> > + * corresponding physical addresses. >> > + */ >> > + ret =3D probe_access_range_flags(env, address, cbomlen, >> MMU_DATA_LOAD, >> > + mmu_idx, true, &phost, ra); >> > + if (ret =3D=3D TLB_INVALID_MASK) >> > + ret =3D probe_access_range_flags(env, address, cbomlen, >> MMU_INST_FETCH, >> > + mmu_idx, true, &phost, ra); >> > + if (ret =3D=3D TLB_INVALID_MASK) >> > + probe_access_range_flags(env, address, cbomlen, MMU_DATA_STOR= E, >> > + mmu_idx, false, &phost, ra); >> > +} >> > + >> >> >> I think it's a little different here. Probe_access_range_flags may >> trigger different execptions for different access_type. For example: >> >> If the page for the address is executable and readable but not >> writable, and the access cannot pass the pmp check for all access_type, >> >> it may trigger access fault for load/fetch access, and trigger page >> fault for store access. >> > > Just to be clear: > The patch does not trigger any fault for LOAD or FETCH because nonfault i= s > set > to true (6th argument of probe_access_range_flags()). > Only the last call to probe_access_range_flags() raises an exception. > > Section 2.5.2 states the following: > """ > If access to the cache block is not permitted, a cache-block management > instruction raises a store page fault or store guest-page fault exception > if address translation does not permit any > access or raises a store access fault exception otherwise. > """ > > In your scenario we have (1...allowed; 0...not allowed): > * read: perm:1, pmp:0 > * fetch: perm:1: pmp:0 > * write: perm:0, pmp:0 > > Address translation would allow read and fetch access, but PMP blocks tha= t. > So the "does not permit any"-part is wrong, therefore we should raise a > store page fault. > > In fact, I can't predict what will happen, because the code in > target/riscv/cpu_helper.c does > not really prioritize page faults or PMP faults. it returns one of them, > once they are encountered. > Hi Christoph, May I ask what does "page faults or PMP faults are not prioritized" here mean? In target/riscv/cpu_helper.c, if "pmp_violation" flag is not set to true, page fault will be picked. So as long as the TRANSLATE_PMP_FAIL is returned, it will be indicated as a PMP fault. (The only exception I can see is that TRANSLATE_PMP_FAIL may be converted to TRANSLATE_G_STAGE_FAIL if it's the second stage translation and PMP fault on PTE entry's PA.) As the "final PA" is checked only after the page table is walked, shouldn't the "pmp_violation" flag only be set after all the translation accesses are checked and granted? Regards, Frank Chang > > In order to model this properly, we would have to refactor cpu_helper.c t= o > separate page permissions > from PMP. However, that seems a bit out of scope for a Zicbo* support > patchset. > > > >> >> I think the final exception should be access fault instead of the page >> fault caused by probe_access_range_flags with MMU_DATA_STORE. >> >> Regards, >> >> Weiwei Li >> >> --0000000000001202fa05d99bd865 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Christoph M=C3=BCllner <cmuellner@linux.com> =E6=96=BC 2022=E5=B9= =B42=E6=9C=8817=E6=97=A5 =E9=80=B1=E5=9B=9B =E4=B8=8B=E5=8D=8812:00=E5=AF= =AB=E9=81=93=EF=BC=9A


On T= hu, Feb 17, 2022 at 3:15 AM Weiwei Li <liweiwei@iscas.ac.cn> wrote:

=E5=9C=A8 2022/2/16 =E4=B8=8B=E5=8D=8811:48, Christoph Muellner =E5=86=99= =E9=81=93:
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 39ffb883fc..04500fe352 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -764,6 +764,10 @@ static Property riscv_cpu_properties[] =3D {
>=C2=A0 =C2=A0 =C2=A0 =C2=A0DEFINE_PROP_BOOL("Counters", RISCV= CPU, cfg.ext_counters, true),
>=C2=A0 =C2=A0 =C2=A0 =C2=A0DEFINE_PROP_BOOL("Zifencei", RISCV= CPU, cfg.ext_ifencei, true),
>=C2=A0 =C2=A0 =C2=A0 =C2=A0DEFINE_PROP_BOOL("Zicsr", RISCVCPU= , cfg.ext_icsr, true),
> +=C2=A0 =C2=A0 DEFINE_PROP_BOOL("zicbom", RISCVCPU, cfg.ext_= icbom, true),
> +=C2=A0 =C2=A0 DEFINE_PROP_BOOL("zicboz", RISCVCPU, cfg.ext_= icboz, true),
> +=C2=A0 =C2=A0 DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU= , cfg.cbom_blocksize, 64),
> +=C2=A0 =C2=A0 DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU= , cfg.cboz_blocksize, 64),
Why use two different cache block size here? Is there any new spec
update for this?

No, we are talking abo= ut the same specification.

Section 2.7 states the = following:
"""
The initial set of CM= O extensions requires the following information to be discovered by softwar= e:
* The size of the cache block for management and prefetch instruction= s
* The size of the cache block for zero instructions
* CBIE support = at each privilege level
"""

So at least the spec authors did differentiate between the two block= sizes as well.
=C2=A0
>=C2=A0 =C2=A0 =C2=A0 =C2=A0DEFINE_PROP_BOOL("Zfh", RISCVCPU, = cfg.ext_zfh, false),
>=C2=A0 =C2=A0 =C2=A0 =C2=A0DEFINE_PROP_BOOL("Zfhmin", RISCVCP= U, cfg.ext_zfhmin, false),
>=C2=A0 =C2=A0 =C2=A0 =C2=A0DEFINE_PROP_BOOL("Zve32f", RISCVCP= U, cfg.ext_zve32f, false),
> +
> +/* helper_zicbom_access
> + *
> + * Check access permissions (LOAD, STORE or FETCH as specified in sec= tion
> + * 2.5.2 of the CMO specification) for Zicbom, raising either store > + * page-fault (non-virtualised) or store guest-page fault (virtualise= d).
> + */
> +static void helper_zicbom_access(CPURISCVState *env, target_ulong add= ress,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0uintptr_t ra)
> +{
> +=C2=A0 =C2=A0 int ret;
> +=C2=A0 =C2=A0 void* phost;
> +=C2=A0 =C2=A0 int mmu_idx =3D cpu_mmu_index(env, false);
> +
> +=C2=A0 =C2=A0 /* Get the size of the cache block for management instr= uctions. */
> +=C2=A0 =C2=A0 RISCVCPU *cpu =3D env_archcpu(env);
> +=C2=A0 =C2=A0 uint16_t cbomlen =3D cpu->cfg.cbom_blocksize;
> +
> +=C2=A0 =C2=A0 /* Mask off low-bits to align-down to the cache-block. = */
> +=C2=A0 =C2=A0 address &=3D ~(cbomlen - 1);
> +
> +=C2=A0 =C2=A0 /* A cache-block management instruction is permitted to= access
> +=C2=A0 =C2=A0 =C2=A0* the specified cache block whenever a load instr= uction, store
> +=C2=A0 =C2=A0 =C2=A0* instruction, or instruction fetch is permitted = to access the
> +=C2=A0 =C2=A0 =C2=A0* corresponding physical addresses.
> +=C2=A0 =C2=A0 =C2=A0*/
> +=C2=A0 =C2=A0 ret =3D probe_access_range_flags(env, address, cbomlen,= MMU_DATA_LOAD,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0mmu_idx, true, &= ;phost, ra);
> +=C2=A0 =C2=A0 if (ret =3D=3D TLB_INVALID_MASK)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 ret =3D probe_access_range_flags(env, add= ress, cbomlen, MMU_INST_FETCH,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0mmu_i= dx, true, &phost, ra);
> +=C2=A0 =C2=A0 if (ret =3D=3D TLB_INVALID_MASK)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 probe_access_range_flags(env, address, cb= omlen, MMU_DATA_STORE,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0mmu_idx, false, &phost= , ra);
> +}
> +


I think it's a little different here. Probe_access_range_flags may
trigger different execptions for different access_type. For example:

If=C2=A0 the page for the address=C2=A0 is executable and readable but not =
writable,=C2=A0 and the access cannot pass the pmp check for all access_typ= e,

it may trigger access fault for load/fetch access, and=C2=A0 trigger page <= br> fault for=C2=A0 store access.

Just to b= e clear:
The patch does not trigger any fault for LOAD or FETCH b= ecause nonfault=C2=A0is set
to true (6th argument of probe_access= _range_flags()).
Only the last call to probe_access_range_flags()= raises an exception.

Section 2.5.2 states the fol= lowing:
"""
If access to the cache b= lock is not permitted, a cache-block management
instruction raises a sto= re page fault or store guest-page fault exception if address translation do= es not permit any
access or raises a store access fault exception otherw= ise.
"""

In your scen= ario we have (1...allowed; 0...not allowed):
* read: perm:1, pmp:= 0
* fetch: perm:1: pmp:0
* write: perm:0, pmp:0

Address translation would allow read and fetch access, bu= t PMP blocks that.
So the "does not permit any"-part is= wrong, therefore we should raise a store page fault.

<= div>In fact, I can't predict what will happen, because the code in targ= et/riscv/cpu_helper.c does
not really prioritize page faults or P= MP faults. it returns one of them, once they are encountered.

Hi Christoph,

May I ask what does "page faults or PMP faults are not prioritized&qu= ot; here mean?

In target/riscv/cpu_helper.c, if &q= uot;pmp_violation" flag is not set to true, page fault will be picked.=
So as long as the=C2=A0TRANSLATE_PMP_FAIL is returned, it will b= e indicated as a PMP fault.
(The only exception I can see is that= =C2=A0TRANSLATE_PMP_FAIL may be converted to=C2=A0TRANSLATE_G_STAGE_FAIL
=C2=A0 if it's the second stage translation and PMP fault on PT= E entry's PA.)
As the "final PA" is checked only af= ter the page table is walked,
shouldn't the "pmp_vio= lation" flag only be set after all the translation accesses are checke= d and granted?

Regards,
Frank Chang
= =C2=A0

In order to model t= his properly, we would have to refactor cpu_helper.c to separate page permi= ssions
from PMP. However, that seems a bit out of scope for a Zic= bo* support patchset.

=C2=A0

I think the final exception should be access fault instead of the page
fault caused by probe_access_range_flags with MMU_DATA_STORE.

Regards,

Weiwei Li

--0000000000001202fa05d99bd865--