qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Jones <ajones@ventanamicro.com>
To: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: qemu-devel@nongnu.org, qemu-riscv@nongnu.org,
	alistair.francis@wdc.com,  bmeng@tinylab.org,
	liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com,
	 palmer@rivosinc.com, peter.maydell@linaro.org,
	Tomasz Jeznach <tjeznach@rivosinc.com>,
	Sebastien Boeuf <seb@rivosinc.com>
Subject: Re: [PATCH v8 03/12] hw/riscv: add RISC-V IOMMU base emulation
Date: Thu, 3 Oct 2024 15:31:46 +0200	[thread overview]
Message-ID: <20241003-bd98395898af6d83eeef0be4@orel> (raw)
In-Reply-To: <7c5f0fce-fc66-4fca-90be-aa2d4f7a4b04@ventanamicro.com>

On Thu, Oct 03, 2024 at 10:06:11AM GMT, Daniel Henrique Barboza wrote:
> 
> 
> On 10/3/24 6:26 AM, Andrew Jones wrote:
> > On Tue, Oct 01, 2024 at 10:02:58PM GMT, Daniel Henrique Barboza wrote:
> > ...
> > > +/*
> > > + * RISCV IOMMU Address Translation Lookup - Page Table Walk
> > > + *
> > > + * Note: Code is based on get_physical_address() from target/riscv/cpu_helper.c
> > > + * Both implementation can be merged into single helper function in future.
> > > + * Keeping them separate for now, as error reporting and flow specifics are
> > > + * sufficiently different for separate implementation.
> > > + *
> > > + * @s        : IOMMU Device State
> > > + * @ctx      : Translation context for device id and process address space id.
> > > + * @iotlb    : translation data: physical address and access mode.
> > > + * @return   : success or fault cause code.
> > > + */
> > > +static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> > > +    IOMMUTLBEntry *iotlb)
> > > +{
> > > +    dma_addr_t addr, base;
> > > +    uint64_t satp, gatp, pte;
> > > +    bool en_s, en_g;
> > > +    struct {
> > > +        unsigned char step;
> > > +        unsigned char levels;
> > > +        unsigned char ptidxbits;
> > > +        unsigned char ptesize;
> > > +    } sc[2];
> > > +    /* Translation stage phase */
> > > +    enum {
> > > +        S_STAGE = 0,
> > > +        G_STAGE = 1,
> > > +    } pass;
> > > +    MemTxResult ret;
> > > +
> > > +    satp = get_field(ctx->satp, RISCV_IOMMU_ATP_MODE_FIELD);
> > > +    gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD);
> > > +
> > > +    en_s = satp != RISCV_IOMMU_DC_FSC_MODE_BARE;
> > > +    en_g = gatp != RISCV_IOMMU_DC_IOHGATP_MODE_BARE;
> > > +
> > > +    /*
> > > +     * Early check for MSI address match when IOVA == GPA. This check
> > > +     * is required to ensure MSI translation is applied in case
> > > +     * first-stage translation is set to BARE mode. In this case IOVA
> > > +     * provided is a valid GPA. Running translation through page walk
> > > +     * second stage translation will incorrectly try to translate GPA
> > > +     * to host physical page, likely hitting IOPF.
> > > +     */
> > 
> > Why was this comment expanded from the simple
> > 
> > "Early check for MSI address match when IOVA == GPA."
> > 
> > The comment is now incorrect since the check is required even when
> > first-stage translation is not BARE. I just skimmed the spec again trying
> > to figure out if the removal of '!en_s' is a hack or a fix, and I'm
> > inclined to say "fix", but it's an incomplete fix. I found a sentence that
> > says
> > 
> > "If the virtual memory scheme selected for first-stage is Bare but the
> > scheme for the second-stage is not Bare then the IOVA is a GPA."
> > 
> > which, in a way, defines a GPA to only be a GPA when second-stage is used
> > (and all MSI translation specifications refer to GPAs). However, maybe I
> > missed it, but I couldn't find any actual reason that the MSI table can't
> > be used when first-stage is not BARE and second-stage is (and, of course,
> > it makes no difference for single-stage translations to call IOVAs GPAs
> > or not).
> > 
> > Now, I also see
> > 
> > "If the virtual memory scheme selected for neither stage is Bare then the
> > IOVA is a VA. Two-stage address translation is in effect. The first-stage
> > translates the VA to a GPA and the second-stage translates the GPA to a
> > SPA."
> > 
> > in the spec, which means we should probably change the removal of '!en_s'
> > to '!(en_s && en_g)'. VFIO+irqbypass would still work with that and, when
> > Linux learns to support two-stage translation, we wouldn't incorrectly try
> > to check for a GVA in the MSI table.
> 
> Ok. It seems to me that we can't rely on the riscv-iommu spec alone to let
> us know how to detect if IOVA == GPA, given that one of the main usages
> we have ATM (VFIO irqbypass) will use GPAs with S-stage enabled.
> 
> (Note: shouldn't we open a bug against the riscv-iommu spec?)
> 

I can try writing a sentence or two for the spec to clarify this and send
a PR.

> I like the idea of ruling out the case where IOVA == VA since that is clear
> in the spec. We also know that MSI entries will always contains GPAs.
> 
> This is the change I'm going to make in v9:
> 
> 
>     /*
>      * Early check for MSI address match when IOVA == GPA to
>      * support VFIO+irqbypass. The riscv-iommu spec doesn't
>      * consider the case where a GPA can be produced by S-stage
>      * only, but we have real life examples like Linux VFIO that
>      * work that way. The spec alone does not provide a reliable
>      * way of detecting if IOVA == GPA.
>      *
>      * The spec is clear about what is a VA: "If the virtual
>      * memory scheme selected for neither stage is Bare then
>      * the IOVA is a VA", in our case "(en_s && en_g)". We also
>      * know that MSI tables will always hold GPAs.
>      *
>      * Thus the check consists of ruling out VAs and checking
>      * the MSI table.
>      */
>     if (!(en_s && en_g) && (iotlb->perm & IOMMU_WO) &&
>         riscv_iommu_msi_check(s, ctx, iotlb->iova)) {
>         iotlb->target_as = &s->trap_as;
>         iotlb->translated_addr = iotlb->iova;
>         iotlb->addr_mask = ~TARGET_PAGE_MASK;
>         return 0;
>     }

LGTM

Thanks,
drew

> 
> Tomasz, let me know if you have any opinions against it. I intend to send
> the v9 start of next week.
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> 
> > 
> > Thanks,
> > drew


  reply	other threads:[~2024-10-03 13:32 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-02  1:02 [PATCH v8 00/12] riscv: QEMU RISC-V IOMMU Support Daniel Henrique Barboza
2024-10-02  1:02 ` [PATCH v8 01/12] exec/memtxattr: add process identifier to the transaction attributes Daniel Henrique Barboza
2024-10-02  1:02 ` [PATCH v8 02/12] hw/riscv: add riscv-iommu-bits.h Daniel Henrique Barboza
2024-10-02  1:02 ` [PATCH v8 03/12] hw/riscv: add RISC-V IOMMU base emulation Daniel Henrique Barboza
2024-10-03  9:26   ` Andrew Jones
2024-10-03 13:06     ` Daniel Henrique Barboza
2024-10-03 13:31       ` Andrew Jones [this message]
2024-10-03 18:36       ` Tomasz Jeznach
2024-10-04  8:33         ` Andrew Jones
2024-10-04 13:00           ` Daniel Henrique Barboza
2024-10-04 14:46             ` Tomasz Jeznach
2024-10-02  1:02 ` [PATCH v8 04/12] pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 05/12] hw/riscv: add riscv-iommu-pci reference device Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 06/12] hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 07/12] test/qtest: add riscv-iommu-pci tests Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 08/12] hw/riscv/riscv-iommu: add Address Translation Cache (IOATC) Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 09/12] hw/riscv/riscv-iommu: add ATS support Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 10/12] hw/riscv/riscv-iommu: add DBG support Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 11/12] qtest/riscv-iommu-test: add init queues test Daniel Henrique Barboza
2024-10-02  1:03 ` [PATCH v8 12/12] docs/specs: add riscv-iommu Daniel Henrique Barboza

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241003-bd98395898af6d83eeef0be4@orel \
    --to=ajones@ventanamicro.com \
    --cc=alistair.francis@wdc.com \
    --cc=bmeng@tinylab.org \
    --cc=dbarboza@ventanamicro.com \
    --cc=liwei1518@gmail.com \
    --cc=palmer@rivosinc.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-riscv@nongnu.org \
    --cc=seb@rivosinc.com \
    --cc=tjeznach@rivosinc.com \
    --cc=zhiwei_liu@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).