Linux CXL
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <linuxarm@huawei.com>
Subject: Re: Not enough CXL HDM decoders in pass through host bridges (sort of)
Date: Fri, 17 Feb 2023 14:29:01 +0000	[thread overview]
Message-ID: <20230217142901.00000ba3@huawei.com> (raw)
In-Reply-To: <20230217113320.000045fa@Huawei.com>

On Fri, 17 Feb 2023 11:33:20 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Thu, 16 Feb 2023 14:11:50 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Jonathan Cameron wrote:  
> > > Hi Dan,
> > > 
> > > I've finally been adding support for multiple HDM decoders in QEMU
> > > (need to implement the address decode at EPs, but have it working at
> > > HB and switch USP)
> > > 
> > > Whilst testing ran into a corner case on the kernel side of things.
> > > 
> > >            Host Bridge
> > >                 |
> > >             Root Port
> > >                 |
> > >            Switch USP
> > >                 |   
> > >         ________|_________       
> > >        |                  |
> > >      DSP0               DSP1
> > >        |                  |
> > >      Type3              Type3

Note the same problem occurs with more than one region on a single type 3 device
if you have a passthrough decoder based HB upsteam (even if it's just a one rp
version with actual decoders).

Kind of obvious with hindsight but I didn't have code in place to test that
until today.


> > > 
> > > 
> > > Previously I'd been testing this with either an interleave across the two
> > > Type3 devices or with just one in use (as I only had one HDM decoder in the SW USP
> > > so couldn't handle anything else)
> > > 
> > > Now I have lots of decoders, I added a simple test with two regions. One on each of
> > > the type 3 devices.
> > > 
> > > It fails on the second region (with a "no decoder available error") because...
> > > 
> > > It's trying to find an HDM decoder in the host bridge and the fake one used
> > > for a pass through decoder is 'already in use' by the first region.
> > > 
> > > I'm not sure we can simply skip the check in this case because cxld->region
> > > can only point at one region at a time and I haven't though through
> > > the impacts of that for a pass through decoder.  The cynic in me says
> > > that if this HB had more RPs, we'd have a maximum of 32 decoders, so
> > > just fake 32 of them instead of 1 and not worry about it any more but
> > > that feels like a hack and probably has side effects.
> > > 
> > > I thought I'd raise the issue first and think about a solution afterwards
> > > (and secretly hope it is fixed before I get to it ;)).    
> > 
> > Hmm, in the case of no decoders in a host-bridge the root-decoders are
> > effectively mirrored at that level. I.e. root-decoders already support
> > the property of hosting multiple regions in a decoder so perhaps
> > tunneling them in this case would be a better model then establishing a
> > unique "passthrough decoder".  
> 
> Agreed. Something along those lines would make sense.
> 
> >   
> > > Obviously I can avoid the whole thing by adding an RP and hence have
> > > actually decoders to use up on the host bridge which does fine for
> > > testing my QEMU work, but we still need to fix this up - unless I'm missing
> > > some subtlety.    
> > 
> > I just find it difficult to believe that someone will build decoder-less
> > host bridges. The moment that you have multiple CFMWS windows to account
> > for ram vs pmem, type-2 vs type-3, etc... then it mandates multiple
> > distinct decoders at each level.  
> 
> Not sure the spec does mandate them in the host bridge even with multiple CFMWS windows.
> 
> My reading is the assumption is that if the host is routing to the HB
> (via a CFMWS) there is no need to decode at all. Everything just gets
> forwarded through to the single RP. Interleave / type etc requires decoders
> at the next level (either EP or switch USP)

Se figure 9-16 for a complex setup with no HDM decoders in the HB.
Admittedly this is still a single region so not definitive of what people might build.

Jonathan

> 
> > 
> > At a minimum I think this problem can be solved at leisurely place
> > unless someone can point to a non-QEMU example of such a platform to
> > raise the urgency.  
> 
> Sure - low priority for now (though I bet I get QEMU bug reports much
> like we did before I made default to not have the decoders for this case
> in the first place).
> Could just make the kernel deal with a single RP HB that doesn't do pass
> through.  Last time I checked that doesn't work yet (which is only reason
> we have pass through decoders in QEMU).
> 
> >   
> > > There is another question of whether we should make some effort to conserve
> > > decoders - so if we can just expand an existing one to cover a wider range?
> > > We can't do that after commit, but maybe there is a dance we could do to
> > > soft commit a bunch of regions, then hard commit them as a set.
> > > HDM decoders may be a precious resource on some systems even though CXL 3.0
> > > let's host bridges have 32 of them.  One to tackle only when it's a real problem
> > > though.    
> > 
> > How would that work a practice? A switch can only target multiple
> > downstreams for a given address range if they are interleaved, and if
> > they are interleaved it's a single region.
> >  
> 
> Simplest one is the topology above, but they've decided not to interleave
> for whatever reason - perhaps different levels of resilience to errors.
> OS decides to put the two devices next to each other in HPA space so only
> one decoder is needed in the HB (two in the switch of course).  I haven't
> checked if we support a BIOS having decided to do this.  I'm not that
> worried about doing it in the OS as the HB probably has enough decoder to
> avoid it.  Maybe we need t solve that long term, but let's wait for someone
> to scream about it.
> 
> Jonathan
> 
> 
> 


      reply	other threads:[~2023-02-17 14:29 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-16 18:30 Not enough CXL HDM decoders in pass through host bridges (sort of) Jonathan Cameron
2023-02-16 22:11 ` Dan Williams
2023-02-17 11:33   ` Jonathan Cameron
2023-02-17 14:29     ` Jonathan Cameron [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230217142901.00000ba3@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox