Re: Not enough CXL HDM decoders in pass through host bridges (sort of)

Linux CXL
 help / color / mirror / Atom feed

From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>
Subject: Re: Not enough CXL HDM decoders in pass through host bridges (sort of)
Date: Fri, 17 Feb 2023 11:33:20 +0000	[thread overview]
Message-ID: <20230217113320.000045fa@Huawei.com> (raw)
In-Reply-To: <63eeaa26a4ef1_32d61294cf@dwillia2-xfh.jf.intel.com.notmuch>

On Thu, 16 Feb 2023 14:11:50 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > Hi Dan,
> > 
> > I've finally been adding support for multiple HDM decoders in QEMU
> > (need to implement the address decode at EPs, but have it working at
> > HB and switch USP)
> > 
> > Whilst testing ran into a corner case on the kernel side of things.
> > 
> >            Host Bridge
> >                 |
> >             Root Port
> >                 |
> >            Switch USP
> >                 |   
> >         ________|_________       
> >        |                  |
> >      DSP0               DSP1
> >        |                  |
> >      Type3              Type3
> > 
> > 
> > Previously I'd been testing this with either an interleave across the two
> > Type3 devices or with just one in use (as I only had one HDM decoder in the SW USP
> > so couldn't handle anything else)
> > 
> > Now I have lots of decoders, I added a simple test with two regions. One on each of
> > the type 3 devices.
> > 
> > It fails on the second region (with a "no decoder available error") because...
> > 
> > It's trying to find an HDM decoder in the host bridge and the fake one used
> > for a pass through decoder is 'already in use' by the first region.
> > 
> > I'm not sure we can simply skip the check in this case because cxld->region
> > can only point at one region at a time and I haven't though through
> > the impacts of that for a pass through decoder.  The cynic in me says
> > that if this HB had more RPs, we'd have a maximum of 32 decoders, so
> > just fake 32 of them instead of 1 and not worry about it any more but
> > that feels like a hack and probably has side effects.
> > 
> > I thought I'd raise the issue first and think about a solution afterwards
> > (and secretly hope it is fixed before I get to it ;)).  
> 
> Hmm, in the case of no decoders in a host-bridge the root-decoders are
> effectively mirrored at that level. I.e. root-decoders already support
> the property of hosting multiple regions in a decoder so perhaps
> tunneling them in this case would be a better model then establishing a
> unique "passthrough decoder".

Agreed. Something along those lines would make sense.

> 
> > Obviously I can avoid the whole thing by adding an RP and hence have
> > actually decoders to use up on the host bridge which does fine for
> > testing my QEMU work, but we still need to fix this up - unless I'm missing
> > some subtlety.  
> 
> I just find it difficult to believe that someone will build decoder-less
> host bridges. The moment that you have multiple CFMWS windows to account
> for ram vs pmem, type-2 vs type-3, etc... then it mandates multiple
> distinct decoders at each level.

Not sure the spec does mandate them in the host bridge even with multiple CFMWS windows.

My reading is the assumption is that if the host is routing to the HB
(via a CFMWS) there is no need to decode at all. Everything just gets
forwarded through to the single RP. Interleave / type etc requires decoders
at the next level (either EP or switch USP)

> 
> At a minimum I think this problem can be solved at leisurely place
> unless someone can point to a non-QEMU example of such a platform to
> raise the urgency.

Sure - low priority for now (though I bet I get QEMU bug reports much
like we did before I made default to not have the decoders for this case
in the first place).
Could just make the kernel deal with a single RP HB that doesn't do pass
through.  Last time I checked that doesn't work yet (which is only reason
we have pass through decoders in QEMU).

> 
> > There is another question of whether we should make some effort to conserve
> > decoders - so if we can just expand an existing one to cover a wider range?
> > We can't do that after commit, but maybe there is a dance we could do to
> > soft commit a bunch of regions, then hard commit them as a set.
> > HDM decoders may be a precious resource on some systems even though CXL 3.0
> > let's host bridges have 32 of them.  One to tackle only when it's a real problem
> > though.  
> 
> How would that work a practice? A switch can only target multiple
> downstreams for a given address range if they are interleaved, and if
> they are interleaved it's a single region.
>

Simplest one is the topology above, but they've decided not to interleave
for whatever reason - perhaps different levels of resilience to errors.
OS decides to put the two devices next to each other in HPA space so only
one decoder is needed in the HB (two in the switch of course).  I haven't
checked if we support a BIOS having decided to do this.  I'm not that
worried about doing it in the OS as the HB probably has enough decoder to
avoid it.  Maybe we need t solve that long term, but let's wait for someone
to scream about it.

Jonathan

next prev parent reply	other threads:[~2023-02-17 11:33 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-16 18:30 Not enough CXL HDM decoders in pass through host bridges (sort of) Jonathan Cameron
2023-02-16 22:11 ` Dan Williams
2023-02-17 11:33   ` Jonathan Cameron [this message]
2023-02-17 14:29     ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230217113320.000045fa@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox