* Not enough CXL HDM decoders in pass through host bridges (sort of)
@ 2023-02-16 18:30 Jonathan Cameron
2023-02-16 22:11 ` Dan Williams
0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron @ 2023-02-16 18:30 UTC (permalink / raw)
To: Dan Williams, linux-cxl
Hi Dan,
I've finally been adding support for multiple HDM decoders in QEMU
(need to implement the address decode at EPs, but have it working at
HB and switch USP)
Whilst testing ran into a corner case on the kernel side of things.
Host Bridge
|
Root Port
|
Switch USP
|
________|_________
| |
DSP0 DSP1
| |
Type3 Type3
Previously I'd been testing this with either an interleave across the two
Type3 devices or with just one in use (as I only had one HDM decoder in the SW USP
so couldn't handle anything else)
Now I have lots of decoders, I added a simple test with two regions. One on each of
the type 3 devices.
It fails on the second region (with a "no decoder available error") because...
It's trying to find an HDM decoder in the host bridge and the fake one used
for a pass through decoder is 'already in use' by the first region.
I'm not sure we can simply skip the check in this case because cxld->region
can only point at one region at a time and I haven't though through
the impacts of that for a pass through decoder. The cynic in me says
that if this HB had more RPs, we'd have a maximum of 32 decoders, so
just fake 32 of them instead of 1 and not worry about it any more but
that feels like a hack and probably has side effects.
I thought I'd raise the issue first and think about a solution afterwards
(and secretly hope it is fixed before I get to it ;)).
Obviously I can avoid the whole thing by adding an RP and hence have
actually decoders to use up on the host bridge which does fine for
testing my QEMU work, but we still need to fix this up - unless I'm missing
some subtlety.
There is another question of whether we should make some effort to conserve
decoders - so if we can just expand an existing one to cover a wider range?
We can't do that after commit, but maybe there is a dance we could do to
soft commit a bunch of regions, then hard commit them as a set.
HDM decoders may be a precious resource on some systems even though CXL 3.0
let's host bridges have 32 of them. One to tackle only when it's a real problem
though.
Jonathan
^ permalink raw reply [flat|nested] 4+ messages in thread* RE: Not enough CXL HDM decoders in pass through host bridges (sort of)
2023-02-16 18:30 Not enough CXL HDM decoders in pass through host bridges (sort of) Jonathan Cameron
@ 2023-02-16 22:11 ` Dan Williams
2023-02-17 11:33 ` Jonathan Cameron
0 siblings, 1 reply; 4+ messages in thread
From: Dan Williams @ 2023-02-16 22:11 UTC (permalink / raw)
To: Jonathan Cameron, Dan Williams, linux-cxl
Jonathan Cameron wrote:
> Hi Dan,
>
> I've finally been adding support for multiple HDM decoders in QEMU
> (need to implement the address decode at EPs, but have it working at
> HB and switch USP)
>
> Whilst testing ran into a corner case on the kernel side of things.
>
> Host Bridge
> |
> Root Port
> |
> Switch USP
> |
> ________|_________
> | |
> DSP0 DSP1
> | |
> Type3 Type3
>
>
> Previously I'd been testing this with either an interleave across the two
> Type3 devices or with just one in use (as I only had one HDM decoder in the SW USP
> so couldn't handle anything else)
>
> Now I have lots of decoders, I added a simple test with two regions. One on each of
> the type 3 devices.
>
> It fails on the second region (with a "no decoder available error") because...
>
> It's trying to find an HDM decoder in the host bridge and the fake one used
> for a pass through decoder is 'already in use' by the first region.
>
> I'm not sure we can simply skip the check in this case because cxld->region
> can only point at one region at a time and I haven't though through
> the impacts of that for a pass through decoder. The cynic in me says
> that if this HB had more RPs, we'd have a maximum of 32 decoders, so
> just fake 32 of them instead of 1 and not worry about it any more but
> that feels like a hack and probably has side effects.
>
> I thought I'd raise the issue first and think about a solution afterwards
> (and secretly hope it is fixed before I get to it ;)).
Hmm, in the case of no decoders in a host-bridge the root-decoders are
effectively mirrored at that level. I.e. root-decoders already support
the property of hosting multiple regions in a decoder so perhaps
tunneling them in this case would be a better model then establishing a
unique "passthrough decoder".
> Obviously I can avoid the whole thing by adding an RP and hence have
> actually decoders to use up on the host bridge which does fine for
> testing my QEMU work, but we still need to fix this up - unless I'm missing
> some subtlety.
I just find it difficult to believe that someone will build decoder-less
host bridges. The moment that you have multiple CFMWS windows to account
for ram vs pmem, type-2 vs type-3, etc... then it mandates multiple
distinct decoders at each level.
At a minimum I think this problem can be solved at leisurely place
unless someone can point to a non-QEMU example of such a platform to
raise the urgency.
> There is another question of whether we should make some effort to conserve
> decoders - so if we can just expand an existing one to cover a wider range?
> We can't do that after commit, but maybe there is a dance we could do to
> soft commit a bunch of regions, then hard commit them as a set.
> HDM decoders may be a precious resource on some systems even though CXL 3.0
> let's host bridges have 32 of them. One to tackle only when it's a real problem
> though.
How would that work a practice? A switch can only target multiple
downstreams for a given address range if they are interleaved, and if
they are interleaved it's a single region.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Not enough CXL HDM decoders in pass through host bridges (sort of)
2023-02-16 22:11 ` Dan Williams
@ 2023-02-17 11:33 ` Jonathan Cameron
2023-02-17 14:29 ` Jonathan Cameron
0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron @ 2023-02-17 11:33 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-cxl
On Thu, 16 Feb 2023 14:11:50 -0800
Dan Williams <dan.j.williams@intel.com> wrote:
> Jonathan Cameron wrote:
> > Hi Dan,
> >
> > I've finally been adding support for multiple HDM decoders in QEMU
> > (need to implement the address decode at EPs, but have it working at
> > HB and switch USP)
> >
> > Whilst testing ran into a corner case on the kernel side of things.
> >
> > Host Bridge
> > |
> > Root Port
> > |
> > Switch USP
> > |
> > ________|_________
> > | |
> > DSP0 DSP1
> > | |
> > Type3 Type3
> >
> >
> > Previously I'd been testing this with either an interleave across the two
> > Type3 devices or with just one in use (as I only had one HDM decoder in the SW USP
> > so couldn't handle anything else)
> >
> > Now I have lots of decoders, I added a simple test with two regions. One on each of
> > the type 3 devices.
> >
> > It fails on the second region (with a "no decoder available error") because...
> >
> > It's trying to find an HDM decoder in the host bridge and the fake one used
> > for a pass through decoder is 'already in use' by the first region.
> >
> > I'm not sure we can simply skip the check in this case because cxld->region
> > can only point at one region at a time and I haven't though through
> > the impacts of that for a pass through decoder. The cynic in me says
> > that if this HB had more RPs, we'd have a maximum of 32 decoders, so
> > just fake 32 of them instead of 1 and not worry about it any more but
> > that feels like a hack and probably has side effects.
> >
> > I thought I'd raise the issue first and think about a solution afterwards
> > (and secretly hope it is fixed before I get to it ;)).
>
> Hmm, in the case of no decoders in a host-bridge the root-decoders are
> effectively mirrored at that level. I.e. root-decoders already support
> the property of hosting multiple regions in a decoder so perhaps
> tunneling them in this case would be a better model then establishing a
> unique "passthrough decoder".
Agreed. Something along those lines would make sense.
>
> > Obviously I can avoid the whole thing by adding an RP and hence have
> > actually decoders to use up on the host bridge which does fine for
> > testing my QEMU work, but we still need to fix this up - unless I'm missing
> > some subtlety.
>
> I just find it difficult to believe that someone will build decoder-less
> host bridges. The moment that you have multiple CFMWS windows to account
> for ram vs pmem, type-2 vs type-3, etc... then it mandates multiple
> distinct decoders at each level.
Not sure the spec does mandate them in the host bridge even with multiple CFMWS windows.
My reading is the assumption is that if the host is routing to the HB
(via a CFMWS) there is no need to decode at all. Everything just gets
forwarded through to the single RP. Interleave / type etc requires decoders
at the next level (either EP or switch USP)
>
> At a minimum I think this problem can be solved at leisurely place
> unless someone can point to a non-QEMU example of such a platform to
> raise the urgency.
Sure - low priority for now (though I bet I get QEMU bug reports much
like we did before I made default to not have the decoders for this case
in the first place).
Could just make the kernel deal with a single RP HB that doesn't do pass
through. Last time I checked that doesn't work yet (which is only reason
we have pass through decoders in QEMU).
>
> > There is another question of whether we should make some effort to conserve
> > decoders - so if we can just expand an existing one to cover a wider range?
> > We can't do that after commit, but maybe there is a dance we could do to
> > soft commit a bunch of regions, then hard commit them as a set.
> > HDM decoders may be a precious resource on some systems even though CXL 3.0
> > let's host bridges have 32 of them. One to tackle only when it's a real problem
> > though.
>
> How would that work a practice? A switch can only target multiple
> downstreams for a given address range if they are interleaved, and if
> they are interleaved it's a single region.
>
Simplest one is the topology above, but they've decided not to interleave
for whatever reason - perhaps different levels of resilience to errors.
OS decides to put the two devices next to each other in HPA space so only
one decoder is needed in the HB (two in the switch of course). I haven't
checked if we support a BIOS having decided to do this. I'm not that
worried about doing it in the OS as the HB probably has enough decoder to
avoid it. Maybe we need t solve that long term, but let's wait for someone
to scream about it.
Jonathan
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Not enough CXL HDM decoders in pass through host bridges (sort of)
2023-02-17 11:33 ` Jonathan Cameron
@ 2023-02-17 14:29 ` Jonathan Cameron
0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Cameron @ 2023-02-17 14:29 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-cxl, linuxarm
On Fri, 17 Feb 2023 11:33:20 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> On Thu, 16 Feb 2023 14:11:50 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > Jonathan Cameron wrote:
> > > Hi Dan,
> > >
> > > I've finally been adding support for multiple HDM decoders in QEMU
> > > (need to implement the address decode at EPs, but have it working at
> > > HB and switch USP)
> > >
> > > Whilst testing ran into a corner case on the kernel side of things.
> > >
> > > Host Bridge
> > > |
> > > Root Port
> > > |
> > > Switch USP
> > > |
> > > ________|_________
> > > | |
> > > DSP0 DSP1
> > > | |
> > > Type3 Type3
Note the same problem occurs with more than one region on a single type 3 device
if you have a passthrough decoder based HB upsteam (even if it's just a one rp
version with actual decoders).
Kind of obvious with hindsight but I didn't have code in place to test that
until today.
> > >
> > >
> > > Previously I'd been testing this with either an interleave across the two
> > > Type3 devices or with just one in use (as I only had one HDM decoder in the SW USP
> > > so couldn't handle anything else)
> > >
> > > Now I have lots of decoders, I added a simple test with two regions. One on each of
> > > the type 3 devices.
> > >
> > > It fails on the second region (with a "no decoder available error") because...
> > >
> > > It's trying to find an HDM decoder in the host bridge and the fake one used
> > > for a pass through decoder is 'already in use' by the first region.
> > >
> > > I'm not sure we can simply skip the check in this case because cxld->region
> > > can only point at one region at a time and I haven't though through
> > > the impacts of that for a pass through decoder. The cynic in me says
> > > that if this HB had more RPs, we'd have a maximum of 32 decoders, so
> > > just fake 32 of them instead of 1 and not worry about it any more but
> > > that feels like a hack and probably has side effects.
> > >
> > > I thought I'd raise the issue first and think about a solution afterwards
> > > (and secretly hope it is fixed before I get to it ;)).
> >
> > Hmm, in the case of no decoders in a host-bridge the root-decoders are
> > effectively mirrored at that level. I.e. root-decoders already support
> > the property of hosting multiple regions in a decoder so perhaps
> > tunneling them in this case would be a better model then establishing a
> > unique "passthrough decoder".
>
> Agreed. Something along those lines would make sense.
>
> >
> > > Obviously I can avoid the whole thing by adding an RP and hence have
> > > actually decoders to use up on the host bridge which does fine for
> > > testing my QEMU work, but we still need to fix this up - unless I'm missing
> > > some subtlety.
> >
> > I just find it difficult to believe that someone will build decoder-less
> > host bridges. The moment that you have multiple CFMWS windows to account
> > for ram vs pmem, type-2 vs type-3, etc... then it mandates multiple
> > distinct decoders at each level.
>
> Not sure the spec does mandate them in the host bridge even with multiple CFMWS windows.
>
> My reading is the assumption is that if the host is routing to the HB
> (via a CFMWS) there is no need to decode at all. Everything just gets
> forwarded through to the single RP. Interleave / type etc requires decoders
> at the next level (either EP or switch USP)
Se figure 9-16 for a complex setup with no HDM decoders in the HB.
Admittedly this is still a single region so not definitive of what people might build.
Jonathan
>
> >
> > At a minimum I think this problem can be solved at leisurely place
> > unless someone can point to a non-QEMU example of such a platform to
> > raise the urgency.
>
> Sure - low priority for now (though I bet I get QEMU bug reports much
> like we did before I made default to not have the decoders for this case
> in the first place).
> Could just make the kernel deal with a single RP HB that doesn't do pass
> through. Last time I checked that doesn't work yet (which is only reason
> we have pass through decoders in QEMU).
>
> >
> > > There is another question of whether we should make some effort to conserve
> > > decoders - so if we can just expand an existing one to cover a wider range?
> > > We can't do that after commit, but maybe there is a dance we could do to
> > > soft commit a bunch of regions, then hard commit them as a set.
> > > HDM decoders may be a precious resource on some systems even though CXL 3.0
> > > let's host bridges have 32 of them. One to tackle only when it's a real problem
> > > though.
> >
> > How would that work a practice? A switch can only target multiple
> > downstreams for a given address range if they are interleaved, and if
> > they are interleaved it's a single region.
> >
>
> Simplest one is the topology above, but they've decided not to interleave
> for whatever reason - perhaps different levels of resilience to errors.
> OS decides to put the two devices next to each other in HPA space so only
> one decoder is needed in the HB (two in the switch of course). I haven't
> checked if we support a BIOS having decided to do this. I'm not that
> worried about doing it in the OS as the HB probably has enough decoder to
> avoid it. Maybe we need t solve that long term, but let's wait for someone
> to scream about it.
>
> Jonathan
>
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-02-17 14:29 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-16 18:30 Not enough CXL HDM decoders in pass through host bridges (sort of) Jonathan Cameron
2023-02-16 22:11 ` Dan Williams
2023-02-17 11:33 ` Jonathan Cameron
2023-02-17 14:29 ` Jonathan Cameron
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox