All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mostafa Saleh <smostafa@google.com>
To: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: qemu-arm@nongnu.org, eric.auger@redhat.com,
	peter.maydell@linaro.org, qemu-devel@nongnu.org,
	alex.bennee@linaro.org, maz@kernel.org, nicolinc@nvidia.com,
	julien@xen.org, richard.henderson@linaro.org,
	marcin.juszkiewicz@linaro.org
Subject: Re: [PATCH v4 09/19] hw/arm/smmu-common: Rework TLB lookup for nesting
Date: Wed, 10 Jul 2024 08:39:33 +0000	[thread overview]
Message-ID: <Zo5Ixb3Li81fZPvQ@google.com> (raw)
In-Reply-To: <20240709171345.GC2189727@myrica>

Hi Jean,

On Tue, Jul 09, 2024 at 06:13:45PM +0100, Jean-Philippe Brucker wrote:
> On Tue, Jul 09, 2024 at 07:14:19AM +0000, Mostafa Saleh wrote:
> > Hi Jean,
> > 
> > On Thu, Jul 04, 2024 at 07:12:35PM +0100, Jean-Philippe Brucker wrote:
> > > On Mon, Jul 01, 2024 at 11:02:31AM +0000, Mostafa Saleh wrote:
> > > > In the next patch, combine_tlb() will be added which combines 2 TLB
> > > > entries into one for nested translations, which chooses the granule
> > > > and level from the smallest entry.
> > > > 
> > > > This means that with nested translation, an entry can be cached with
> > > > the granule of stage-2 and not stage-1.
> > > > 
> > > > However, currently, the lookup for an IOVA is done with input stage
> > > > granule, which is stage-1 for nested configuration, which will not
> > > > work with the above logic.
> > > > This patch reworks lookup in that case, so it falls back to stage-2
> > > > granule if no entry is found using stage-1 granule.
> > > 
> > > Why not initialize tt_combined to the minimum granule of stages 1 and 2?
> > > It looks like you introduced it for this. I'm wondering if we lookup the
> > > wrong IOVA if changing the granule size after the address is masked in
> > > smmu_translate()
> > 
> > I am not sure I fully understand, but I don’t think that would work as it is
> > not guaranteed that the minimum granule is the one that would be cached,
> > as we might hit block mappings.
> > 
> > The IOVA at first is masked with the first stage mask for the expected page
> > address, and the lookup logic would mask the address for each level look up,
> > so It should match the alignment of the cached page of that granule and level,
> > and as the combine logic is done with the aligned_addr it is guaranteed by
> > construction that it has to be aligned with stage-1.
> 
> I missed something, this is what I had in mind initially:
> 
> * s1 granule is 64k, s2 granule is 4k
> * the tlb already contains a translations for IOVA 0x30000, tg=4k
> * now we lookup IOVA 0x31000. Masked with the s1 granule, aligned_addr is
>   0x30000. Not found at first because lookup is with tg=64k, but then we
>   call smmu_iotlb_lookup_all_levels() again with the s2 granule and the
>   same IOVA, which returns the wrong translation

If the granules are s1=64k, s2=4k, the only way we get a cached entry as
(IOVA 0x30000, tg=4k) would be for s2 and level-3 as for level-2 it has
to be aligned with 0x200000

So when we look up for 0x31000, there is no entry for it anyway.

But I can see some problems here:
In case also s1 granule is 64k, s2 granule is 4k
- Translation A: 0x31000
- TLB is empty => PTW, entry s1 =  64k 0x30000, s2 = 4k, 0x30000 and
  the cached entry would be 0x30000,tg=4k as the combine logic also
  uses the aligned address
- Translation B: 0x31000 => also misses as the only cached entry
  is 0x30000, 4k

I think this is actually a bug and not just a TLB inefficiency, I need
to think more about it, but my initial thought is not to align the
iova until it’s used by a stage so it can use its granule.

> 
> But it's not actually possible, because if cfg->stage == SMMU_NESTED, then
> in smmu_translate() we end up with
> 
>     } else {
>         /* Stage2. */
>         tt_combined.granule_sz = cfg->s2cfg.granule_sz;
> 
> So I think the condition
> 
> 	(cfg->stage == SMMU_NESTED) && (cfg->s2cfg.granule_sz != tt->granule_sz)
> 
> in this patch is never true?
> 

Ah, that’s a bug, I will fix it, NESTED should use stage-1 granule.

> 
> Then the following scenario:
> 
> * s1 granule is 4k, s2 granule is 64k
> * we lookup IOVA A, miss. The translation gets cached with granule 4k
> * we lookup IOVA A again, but with tt->granule_sz = 64k so we'll
>   never find the entry?
> 
> 
> I guess we want to start the lookup with the smallest granule, and then if
> the s1 and s2 granules differ, retry with the other one. Or with
> SMMU_NESTED, start with the s1 granule and keep this patch to fallback to
> s2 granule, but without masking the IOVA in smmu_translate() (it will be
> masked correctly by smmu_iotlb_lookup_all_levels()).

Thanks for pointing that out, I will think more about it but I sense
that we would need to modify where we align the iova, for translation
and lookup.


Thanks,
Mostafa

> 
> Thanks,
> Jean
> 
> > 
> > Thanks,
> > Mostafa
> > 
> > > 
> > > Thanks,
> > > Jean
> > > 
> > > > 
> > > > Signed-off-by: Mostafa Saleh <smostafa@google.com>
> > > > ---
> > > >  hw/arm/smmu-common.c | 36 ++++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 34 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> > > > index 21982621c0..0840b5cffd 100644
> > > > --- a/hw/arm/smmu-common.c
> > > > +++ b/hw/arm/smmu-common.c
> > > > @@ -66,8 +66,10 @@ SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, uint64_t iova,
> > > >      return key;
> > > >  }
> > > >  
> > > > -SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg,
> > > > -                                SMMUTransTableInfo *tt, hwaddr iova)
> > > > +static SMMUTLBEntry *smmu_iotlb_lookup_all_levels(SMMUState *bs,
> > > > +                                                  SMMUTransCfg *cfg,
> > > > +                                                  SMMUTransTableInfo *tt,
> > > > +                                                  hwaddr iova)
> > > >  {
> > > >      uint8_t tg = (tt->granule_sz - 10) / 2;
> > > >      uint8_t inputsize = 64 - tt->tsz;
> > > > @@ -88,6 +90,36 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg,
> > > >          }
> > > >          level++;
> > > >      }
> > > > +    return entry;
> > > > +}
> > > > +
> > > > +/**
> > > > + * smmu_iotlb_lookup - Look up for a TLB entry.
> > > > + * @bs: SMMU state which includes the TLB instance
> > > > + * @cfg: Configuration of the translation
> > > > + * @tt: Translation table info (granule and tsz)
> > > > + * @iova: IOVA address to lookup
> > > > + *
> > > > + * returns a valid entry on success, otherwise NULL.
> > > > + * In case of nested translation, tt can be updated to include
> > > > + * the granule of the found entry as it might different from
> > > > + * the IOVA granule.
> > > > + */
> > > > +SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg,
> > > > +                                SMMUTransTableInfo *tt, hwaddr iova)
> > > > +{
> > > > +    SMMUTLBEntry *entry = NULL;
> > > > +
> > > > +    entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova);
> > > > +    /*
> > > > +     * For nested translation also try the s2 granule, as the TLB will insert
> > > > +     * it if the size of s2 tlb entry was smaller.
> > > > +     */
> > > > +    if (!entry && (cfg->stage == SMMU_NESTED) &&
> > > > +        (cfg->s2cfg.granule_sz != tt->granule_sz)) {
> > > > +        tt->granule_sz = cfg->s2cfg.granule_sz;
> > > > +        entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova);
> > > > +    }
> > > >  
> > > >      if (entry) {
> > > >          cfg->iotlb_hits++;
> > > > -- 
> > > > 2.45.2.803.g4e1b14247a-goog
> > > > 

  reply	other threads:[~2024-07-10  8:39 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-01 11:02 [PATCH v4 00/19] SMMUv3 nested translation support Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 01/19] hw/arm/smmu-common: Add missing size check for stage-1 Mostafa Saleh
2024-07-04 17:55   ` Jean-Philippe Brucker
2024-07-01 11:02 ` [PATCH v4 02/19] hw/arm/smmu: Fix IPA for stage-2 events Mostafa Saleh
2024-07-04 17:56   ` Jean-Philippe Brucker
2024-07-01 11:02 ` [PATCH v4 03/19] hw/arm/smmuv3: Fix encoding of CLASS in events Mostafa Saleh
2024-07-04 18:02   ` Jean-Philippe Brucker
2024-07-09  7:10     ` Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 04/19] hw/arm/smmu: Use enum for SMMU stage Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 05/19] hw/arm/smmu: Split smmuv3_translate() Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 06/19] hw/arm/smmu: Consolidate ASID and VMID types Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 07/19] hw/arm/smmu: Introduce CACHED_ENTRY_TO_ADDR Mostafa Saleh
2024-07-04 18:04   ` Jean-Philippe Brucker
2024-07-08 13:27     ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 08/19] hw/arm/smmuv3: Translate CD and TT using stage-2 table Mostafa Saleh
2024-07-04 18:08   ` Jean-Philippe Brucker
2024-07-08 14:54     ` Eric Auger
2024-07-09  7:12     ` Mostafa Saleh
2024-07-09 13:00       ` Jean-Philippe Brucker
2024-07-11 13:03         ` Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 09/19] hw/arm/smmu-common: Rework TLB lookup for nesting Mostafa Saleh
2024-07-04 18:12   ` Jean-Philippe Brucker
2024-07-09  7:14     ` Mostafa Saleh
2024-07-09 17:13       ` Jean-Philippe Brucker
2024-07-10  8:39         ` Mostafa Saleh [this message]
2024-07-01 11:02 ` [PATCH v4 10/19] hw/arm/smmu-common: Add support for nested TLB Mostafa Saleh
2024-07-04 18:13   ` Jean-Philippe Brucker
2024-07-08 15:06   ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 11/19] hw/arm/smmu-common: Support nested translation Mostafa Saleh
2024-07-04 18:31   ` Jean-Philippe Brucker
2024-07-09  7:15     ` Mostafa Saleh
2024-07-08 15:19   ` Eric Auger
2024-07-09  7:18     ` Mostafa Saleh
2024-07-09  7:57       ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 12/19] hw/arm/smmu: Support nesting in smmuv3_range_inval() Mostafa Saleh
2024-07-04 18:32   ` Jean-Philippe Brucker
2024-07-09  7:19     ` Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 13/19] hw/arm/smmu: Introduce smmu_iotlb_inv_asid_vmid Mostafa Saleh
2024-07-04 18:33   ` Jean-Philippe Brucker
2024-07-08 16:08     ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 14/19] hw/arm/smmu: Support nesting in the rest of commands Mostafa Saleh
2024-07-04 18:34   ` Jean-Philippe Brucker
2024-07-08 16:15   ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 15/19] hw/arm/smmuv3: Support nested SMMUs in smmuv3_notify_iova() Mostafa Saleh
2024-07-04 18:35   ` Jean-Philippe Brucker
2024-07-09  7:20     ` Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 16/19] hw/arm/smmuv3: Handle translation faults according to SMMUPTWEventInfo Mostafa Saleh
2024-07-04 18:36   ` Jean-Philippe Brucker
2024-07-08 16:30     ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 17/19] hw/arm/smmuv3: Support and advertise nesting Mostafa Saleh
2024-07-04 18:36   ` Jean-Philippe Brucker
2024-07-08 16:51   ` Eric Auger
2024-07-01 11:02 ` [PATCH v4 18/19] hw/arm/smmuv3: Advertise S2FWB Mostafa Saleh
2024-07-04 18:36   ` Jean-Philippe Brucker
2024-07-08 17:09     ` Eric Auger
2024-07-09  7:22       ` Mostafa Saleh
2024-07-09  7:21     ` Mostafa Saleh
2024-07-01 11:02 ` [PATCH v4 19/19] hw/arm/smmu: Refactor SMMU OAS Mostafa Saleh
2024-07-04 18:37   ` Jean-Philippe Brucker
2024-07-08 17:30 ` [PATCH v4 00/19] SMMUv3 nested translation support Eric Auger
2024-07-08 18:13   ` Mostafa Saleh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zo5Ixb3Li81fZPvQ@google.com \
    --to=smostafa@google.com \
    --cc=alex.bennee@linaro.org \
    --cc=eric.auger@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=julien@xen.org \
    --cc=marcin.juszkiewicz@linaro.org \
    --cc=maz@kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.