* PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)

From: Jiaxun Yang @ 2024-07-03 8:52 UTC
To: Christian König, Icenowy Zheng, Huang Rui, Maarten Lankhorst,
    Maxime Ripard, Thomas Zimmermann, David Airlie, Daniel Vetter, bhelgaas
Cc: dri-devel, linux-kernel, linux-pci

On July 2, 2024, at 6:03 PM, Jiaxun Yang wrote:
> On July 2, 2024, at 5:27 PM, Christian König wrote:
>> On 02.07.24 at 11:06, Icenowy Zheng wrote:
>>> [SNIP] However I don't think the definition of the AGP spec could apply to all
>>> PCI(e) implementations. The AGP spec itself doesn't apply to
>>> implementations that do not implement AGP (which is most PCI(e)
>>> implementations today), and it's not in the reference list of the PCIe
>>> spec, so it is of no help in this context.
>> No, exactly that is not correct.
>>
>> See, as I explained, the No-Snoop extension to PCIe was created to help
>> with AGP support and later merged into the base PCIe specification.
>>
>> So the AGP spec is now part of the PCIe spec.

Hi Bjorn & linux-pci folks,

It seems like we have some disputes on the interpretation of the PCIe
specification.

We are seeking your expertise on the question: Does the PCIe specification
mandate cache coherency via snoop?

There is some further context in this thread [1].

[1]: https://lore.kernel.org/all/0db974d40cd8c5dcc723d43c328bac923e0fe33a.camel@icenowy.me/

Thanks
- Jiaxun

>
> We don't really buy this theory.
>
> Keyword "AGP" doesn't appear in "PCI Express Base 4.0 Base Specification" even
> once.
>
> If PCIe is a successor of AGP, where do AGP-specific software interfaces like
> the AGP aperture go? PCIe GPUs are only borrowing software concepts from AGP,
> but they didn't inherit any hardware properties.
>
> [...]
>> We seem to have a misunderstanding here, this is not a software issue.
>> The hardware platform is considered broken by the hardware vendor!
>
> It's up to the specification text to define what compliance means. So far,
> as per Icenowy's analysis of the PCIe specification text itself, it's not
> prohibited.
>
>>
>> In other words people have stitched together hardware in a way which is
>> not supported by the creator of that hardware.
>>
>> So as long as you can't convince anybody from ARM or the RISC-V team or
>> whoever created that hardware to confirm that the hardware actually
>> works you won't get any support for that.
>
> Well, we are trying to support them on our own in mainline, we are not asking
> for any support.
>
> Thanks
> - Jiaxun
>>
>> Regards,
>> Christian.
>
> --
> - Jiaxun

--
- Jiaxun

* Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)

From: Bjorn Helgaas @ 2024-07-03 21:08 UTC
To: Jiaxun Yang
Cc: Christian König, Icenowy Zheng, Huang Rui, Maarten Lankhorst,
    Maxime Ripard, Thomas Zimmermann, David Airlie, Daniel Vetter, bhelgaas,
    dri-devel, linux-kernel, linux-pci

On Wed, Jul 03, 2024 at 04:52:30PM +0800, Jiaxun Yang wrote:
> On July 2, 2024, at 6:03 PM, Jiaxun Yang wrote:
> > On July 2, 2024, at 5:27 PM, Christian König wrote:
> >> On 02.07.24 at 11:06, Icenowy Zheng wrote:
> >>> [SNIP] However I don't think the definition of the AGP spec could apply to all
> >>> PCI(e) implementations. The AGP spec itself doesn't apply to
> >>> implementations that do not implement AGP (which is most PCI(e)
> >>> implementations today), and it's not in the reference list of the PCIe
> >>> spec, so it is of no help in this context.
> >> No, exactly that is not correct.
> >>
> >> See, as I explained, the No-Snoop extension to PCIe was created to help
> >> with AGP support and later merged into the base PCIe specification.
> >>
> >> So the AGP spec is now part of the PCIe spec.
>
> Hi Bjorn & linux-pci folks,
>
> It seems like we have some disputes on the interpretation of the PCIe
> specification.
>
> We are seeking your expertise on the question: Does the PCIe
> specification mandate cache coherency via snoop?

I'm not qualified to opine on this. I'd say it's a question for the
PCI SIG protocol workgroup. https://forum.pcisig.com/ is a place to
start.

Bjorn

* Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)

From: Icenowy Zheng @ 2024-07-04 2:00 UTC
To: Bjorn Helgaas, Jiaxun Yang
Cc: Christian König, Huang Rui, Maarten Lankhorst, Maxime Ripard,
    Thomas Zimmermann, David Airlie, Daniel Vetter, bhelgaas, dri-devel,
    linux-kernel, linux-pci

On Wed, 2024-07-03 at 16:08 -0500, Bjorn Helgaas wrote:
> On Wed, Jul 03, 2024 at 04:52:30PM +0800, Jiaxun Yang wrote:
> > On July 2, 2024, at 6:03 PM, Jiaxun Yang wrote:
> > > On July 2, 2024, at 5:27 PM, Christian König wrote:
> > > > On 02.07.24 at 11:06, Icenowy Zheng wrote:
> > > > > [SNIP] However I don't think the definition of the AGP spec
> > > > > could apply to all
> > > > > PCI(e) implementations. The AGP spec itself doesn't apply to
> > > > > implementations that do not implement AGP (which is most
> > > > > PCI(e)
> > > > > implementations today), and it's not in the reference list of
> > > > > the PCIe
> > > > > spec, so it is of no help in this context.
> > > > No, exactly that is not correct.
> > > >
> > > > See, as I explained, the No-Snoop extension to PCIe was created
> > > > to help
> > > > with AGP support and later merged into the base PCIe
> > > > specification.
> > > >
> > > > So the AGP spec is now part of the PCIe spec.
> >
> > Hi Bjorn & linux-pci folks,
> >
> > It seems like we have some disputes on the interpretation of the PCIe
> > specification.
> >
> > We are seeking your expertise on the question: Does the PCIe
> > specification mandate cache coherency via snoop?
>
> I'm not qualified to opine on this. I'd say it's a question for the
> PCI SIG protocol workgroup. https://forum.pcisig.com/ is a place to
> start.

Sorry for the disturbance. As an individual hacker, I am not eligible to
be a PCI-SIG member and join the discussion there.

So here I want to ask a question as an individual hacker: what's the
policy of linux-pci towards these non-coherent PCIe implementations?

If Christian's statements are right, these implementations are just
out-of-spec. Should they get purged out of the kernel, or should the
kernel at least raise a warning that some HW won't work because of a
non-conformant implementation?

>
> Bjorn

* Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)

From: Christoph Hellwig @ 2024-07-04 6:11 UTC
To: Icenowy Zheng
Cc: Bjorn Helgaas, Jiaxun Yang, Christian König, Huang Rui,
    Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
    Daniel Vetter, bhelgaas, dri-devel, linux-kernel, linux-pci

On Thu, Jul 04, 2024 at 10:00:52AM +0800, Icenowy Zheng wrote:
> So here I want to ask a question as an individual hacker: what's the
> policy of linux-pci towards these non-coherent PCIe implementations?
>
> If Christian's statements are right, these implementations are just
> out-of-spec. Should they get purged out of the kernel, or should the
> kernel at least raise a warning that some HW won't work because of a
> non-conformant implementation?

Nothing in the PCIe specifications mandates a programming model.
Non-coherent DMA is extremely common in lower end devices and, despite
all the issues that it causes, well supported in Linux.

What are you trying to solve?

* Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)

From: Icenowy Zheng @ 2024-07-04 6:40 UTC
To: Christoph Hellwig
Cc: Bjorn Helgaas, Jiaxun Yang, Christian König, Huang Rui,
    Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
    Daniel Vetter, bhelgaas, dri-devel, linux-kernel, linux-pci

On Wed, 2024-07-03 at 23:11 -0700, Christoph Hellwig wrote:
> On Thu, Jul 04, 2024 at 10:00:52AM +0800, Icenowy Zheng wrote:
> > So here I want to ask a question as an individual hacker: what's
> > the
> > policy of linux-pci towards these non-coherent PCIe
> > implementations?
> >
> > If Christian's statements are right, these implementations are
> > just
> > out-of-spec. Should they get purged out of the kernel, or should
> > the kernel at least raise a warning that some HW won't work
> > because of a non-conformant implementation?
>
> Nothing in the PCIe specifications mandates a programming model.
> Non-coherent DMA is extremely common in lower end devices and,
> despite
> all the issues that it causes, well supported in Linux.
>
> What are you trying to solve?

Currently the DRM TTM subsystem (and GPU drivers using it) will assume
coherency and fail on these non-coherent systems with cryptic error
messages (like `[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx
test failed (-110)`) without mentioning coherency issues at all.

My original patchset tries to solve this problem by making the TTM
subsystem aware of the coherency status (and preventing CPU-side cached
mappings when non-coherent), but the TTM maintainer objected and
says TTM's ignorance of non-coherent systems is intentional.

>
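
The idea described in the message above (and in the patch subject, "downgrade cached to write_combined when snooping not available") can be sketched roughly as follows. This is a minimal standalone illustration under stated assumptions, not the actual patch: `enum caching_mode`, `effective_caching()`, and the `snoop_available` flag are hypothetical names invented for this sketch, only loosely modeled on the caching modes TTM distinguishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical caching modes, ordered from least to most CPU caching. */
enum caching_mode {
    CACHING_UNCACHED,
    CACHING_WRITE_COMBINED,
    CACHING_CACHED,
};

/*
 * Pick the CPU mapping mode for a buffer that the device also accesses.
 * If the platform cannot snoop CPU caches on device DMA, a cached CPU
 * mapping could expose stale data, so it is downgraded to
 * write-combined. All other requests pass through unchanged.
 * (snoop_available is an assumed input; how a real driver would obtain
 * it is outside the scope of this sketch.)
 */
enum caching_mode effective_caching(enum caching_mode requested,
                                    bool snoop_available)
{
    assert(requested <= CACHING_CACHED);

    if (requested == CACHING_CACHED && !snoop_available)
        return CACHING_WRITE_COMBINED;
    return requested;
}
```

On a coherent platform the function returns the requested mode untouched, so the downgrade in this sketch only changes behavior where snooping is unavailable.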

* Re: PCIe coherency in spec (was: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when snooping not available)

From: Christoph Hellwig @ 2024-07-04 6:44 UTC
To: Icenowy Zheng
Cc: Christoph Hellwig, Bjorn Helgaas, Jiaxun Yang, Christian König,
    Huang Rui, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
    David Airlie, Daniel Vetter, bhelgaas, dri-devel, linux-kernel, linux-pci

On Thu, Jul 04, 2024 at 02:40:16PM +0800, Icenowy Zheng wrote:
> > Nothing in the PCIe specifications mandates a programming model.
> > Non-coherent DMA is extremely common in lower end devices and,
> > despite
> > all the issues that it causes, well supported in Linux.
> >
> > What are you trying to solve?
>
> Currently the DRM TTM subsystem (and GPU drivers using it) will assume
> coherency and fail on these non-coherent systems with cryptic error
> messages (like `[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx
> test failed (-110)`) without mentioning coherency issues at all.
>
> My original patchset tries to solve this problem by making the TTM
> subsystem aware of the coherency status (and preventing CPU-side cached
> mappings when non-coherent), but the TTM maintainer objected and
> says TTM's ignorance of non-coherent systems is intentional.

From the DMA mapping subsystem POV, all drivers not supporting
DMA-incoherent devices are buggy. But if the DRM maintainers disagree
(and they have in the past) there is not much I can do, especially
given that DRM is rather special in its abuse of all kinds of APIs anyway.