public inbox for linuxppc-dev@ozlabs.org
 help / color / mirror / Atom feed
* amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
@ 2026-03-13 13:23 Dan Horák
  2026-03-15  4:25 ` Ritesh Harjani
  0 siblings, 1 reply; 17+ messages in thread
From: Dan Horák @ 2026-03-13 13:23 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: amd-gfx

Hi,

starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
with the following in the log

...
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
...

After some hints from Alex and bisecting and other investigation I have
found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
is the culprit and reverting it makes amdgpu load (and work) again.

for the record, I have originally opened https://gitlab.freedesktop.org/drm/amd/-/issues/5039


	With regards,

		Dan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-13 13:23 amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer Dan Horák
@ 2026-03-15  4:25 ` Ritesh Harjani
  2026-03-15  9:50   ` Dan Horák
                     ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Ritesh Harjani @ 2026-03-15  4:25 UTC (permalink / raw)
  To: Dan Horák, linuxppc-dev, Gaurav Batra; +Cc: amd-gfx, Donet Tom

Dan Horák <dan@danny.cz> writes:

+cc Gaurav,

> Hi,
>
> starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
> initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
> with the following in the log
>
> ...
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF

                  ^^^^
So looks like this is a PowerNV (Power9) machine.

> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
> ...
>
> After some hints from Alex and bisecting and other investigation I have
> found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
> is the culprit and reverting it makes amdgpu load (and work) again.

Thanks for confirming this. Yes, this was recently added [1]

[1]: https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/ 


@Gaurav,

I am not too familiar with this area; however, looking at the logs shared
by Dan, it looks like we might always be taking the DMA direct allocation
path, and perhaps the device doesn't support the resulting address limit.

 bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
 bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff

Looking at the code..

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index fe7472f13b10..d5743b3c3ab3 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -654,7 +654,7 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	/* let the implementation decide on the zone to allocate from: */
 	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
-	if (dma_alloc_direct(dev, ops)) {
+	if (dma_alloc_direct(dev, ops) || arch_dma_alloc_direct(dev)) {
 		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
 	} else if (use_dma_iommu(dev)) {
 		cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);

Now, do we need arch_dma_alloc_direct() here? It always returns true if
dev->dma_ops_bypass is set to true, without performing the checks that
dma_go_direct() does.

whereas...

/*
 * Check if the devices uses a direct mapping for streaming DMA operations.
 * This allows IOMMU drivers to set a bypass mode if the DMA mask is large
 * enough.
 */
static inline bool
dma_alloc_direct(struct device *dev, const struct dma_map_ops *ops)
{
	return dma_go_direct(dev, dev->coherent_dma_mask, ops);
}

...which, in dma_go_direct(), eventually reaches:

      #ifdef CONFIG_DMA_OPS_BYPASS
          if (dev->dma_ops_bypass)
              return min_not_zero(mask, dev->bus_dma_limit) >=
                      dma_direct_get_required_mask(dev);
      #endif

dma_alloc_direct() already checks for dma_ops_bypass and also if
dev->coherent_dma_mask >= dma_direct_get_required_mask(). So...

.... Do we really need the machinery of arch_dma_{alloc|free}_direct()?
Aren't the checks in dma_alloc_direct() sufficient?

Thoughts?

-ritesh


>
> for the record, I have originally opened https://gitlab.freedesktop.org/drm/amd/-/issues/5039
>
>
> 	With regards,
>
> 		Dan


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-15  4:25 ` Ritesh Harjani
@ 2026-03-15  9:50   ` Dan Horák
  2026-03-16 21:02     ` Gaurav Batra
  2026-03-17 11:43     ` Ritesh Harjani
  2026-03-16 13:55   ` Alex Deucher
  2026-03-23  0:30   ` Timothy Pearson
  2 siblings, 2 replies; 17+ messages in thread
From: Dan Horák @ 2026-03-15  9:50 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: linuxppc-dev, Gaurav Batra, amd-gfx, Donet Tom

Hi Ritesh,

On Sun, 15 Mar 2026 09:55:11 +0530
Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:

> Dan Horák <dan@danny.cz> writes:
> 
> +cc Gaurav,
> 
> > Hi,
> >
> > starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
> > initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
> > with the following in the log
> >
> > ...
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
> 
>                   ^^^^
> So looks like this is a PowerNV (Power9) machine.

correct :-)
 
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
> > ...
> >
> > After some hints from Alex and bisecting and other investigation I have
> > found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
> > is the culprit and reverting it makes amdgpu load (and work) again.
> 
> Thanks for confirming this. Yes, this was recently added [1]
> 
> [1]: https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/ 
> 
> 
> @Gaurav,
> 
> I am not too familiar with this area; however, looking at the logs shared
> by Dan, it looks like we might always be taking the DMA direct allocation
> path, and perhaps the device doesn't support the resulting address limit.
> 
>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff

a complete kernel log is at
https://gitlab.freedesktop.org/-/project/4522/uploads/c4935bca6f37bbd06bb4045c07d00b5b/kernel.log

Please let me know if you need more info.


		Dan

 
> Looking at the code..
> 
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index fe7472f13b10..d5743b3c3ab3 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -654,7 +654,7 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  	/* let the implementation decide on the zone to allocate from: */
>  	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
>  
> -	if (dma_alloc_direct(dev, ops)) {
> +	if (dma_alloc_direct(dev, ops) || arch_dma_alloc_direct(dev)) {
>  		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
>  	} else if (use_dma_iommu(dev)) {
>  		cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);
> 
> Now, do we need arch_dma_alloc_direct() here? It always returns true if
> dev->dma_ops_bypass is set to true, without performing the checks that
> dma_go_direct() does.
> 
> whereas...
> 
> /*
>  * Check if the devices uses a direct mapping for streaming DMA operations.
>  * This allows IOMMU drivers to set a bypass mode if the DMA mask is large
>  * enough.
>  */
> static inline bool
> dma_alloc_direct(struct device *dev, const struct dma_map_ops *ops)
> {
> 	return dma_go_direct(dev, dev->coherent_dma_mask, ops);
> }
>
> ...which, in dma_go_direct(), eventually reaches:
>
>       #ifdef CONFIG_DMA_OPS_BYPASS
>           if (dev->dma_ops_bypass)
>               return min_not_zero(mask, dev->bus_dma_limit) >=
>                       dma_direct_get_required_mask(dev);
>       #endif
> 
> dma_alloc_direct() already checks for dma_ops_bypass and also if
> dev->coherent_dma_mask >= dma_direct_get_required_mask(). So...
> 
> .... Do we really need the machinery of arch_dma_{alloc|free}_direct()?
> Aren't the checks in dma_alloc_direct() sufficient?
> 
> Thoughts?
> 
> -ritesh
> 
> 
> >
> > for the record, I have originally opened https://gitlab.freedesktop.org/drm/amd/-/issues/5039
> >
> >
> > 	With regards,
> >
> > 		Dan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-15  4:25 ` Ritesh Harjani
  2026-03-15  9:50   ` Dan Horák
@ 2026-03-16 13:55   ` Alex Deucher
  2026-03-23  0:30   ` Timothy Pearson
  2 siblings, 0 replies; 17+ messages in thread
From: Alex Deucher @ 2026-03-16 13:55 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Dan Horák, linuxppc-dev, Gaurav Batra, amd-gfx, Donet Tom

On Mon, Mar 16, 2026 at 5:44 AM Ritesh Harjani <ritesh.list@gmail.com> wrote:
>
> Dan Horák <dan@danny.cz> writes:
>
> +cc Gaurav,
>
> > Hi,
> >
> > starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
> > initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
> > with the following in the log
> >
> > ...
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
>
>                   ^^^^
> So looks like this is a PowerNV (Power9) machine.
>
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
> > ...
> >
> > After some hints from Alex and bisecting and other investigation I have
> > found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
> > is the culprit and reverting it makes amdgpu load (and work) again.
>
> Thanks for confirming this. Yes, this was recently added [1]
>
> [1]: https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/
>
>
> @Gaurav,
>
> I am not too familiar with this area; however, looking at the logs shared
> by Dan, it looks like we might always be taking the DMA direct allocation
> path, and perhaps the device doesn't support the resulting address limit.

The device only supports a 40 bit DMA mask.

Alex

>
>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
>
> Looking at the code..
>
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index fe7472f13b10..d5743b3c3ab3 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -654,7 +654,7 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
>         /* let the implementation decide on the zone to allocate from: */
>         flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
>
> -       if (dma_alloc_direct(dev, ops)) {
> +       if (dma_alloc_direct(dev, ops) || arch_dma_alloc_direct(dev)) {
>                 cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
>         } else if (use_dma_iommu(dev)) {
>                 cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);
>
> Now, do we need arch_dma_alloc_direct() here? It always returns true if
> dev->dma_ops_bypass is set to true, without performing the checks that
> dma_go_direct() does.
>
> whereas...
>
> /*
>  * Check if the devices uses a direct mapping for streaming DMA operations.
>  * This allows IOMMU drivers to set a bypass mode if the DMA mask is large
>  * enough.
>  */
> static inline bool
> dma_alloc_direct(struct device *dev, const struct dma_map_ops *ops)
> {
>         return dma_go_direct(dev, dev->coherent_dma_mask, ops);
> }
>
> ...which, in dma_go_direct(), eventually reaches:
>
>       #ifdef CONFIG_DMA_OPS_BYPASS
>           if (dev->dma_ops_bypass)
>               return min_not_zero(mask, dev->bus_dma_limit) >=
>                       dma_direct_get_required_mask(dev);
>       #endif
>
> dma_alloc_direct() already checks for dma_ops_bypass and also if
> dev->coherent_dma_mask >= dma_direct_get_required_mask(). So...
>
> .... Do we really need the machinery of arch_dma_{alloc|free}_direct()?
> Aren't the checks in dma_alloc_direct() sufficient?
>
> Thoughts?
>
> -ritesh
>
>
> >
> > for the record, I have originally opened https://gitlab.freedesktop.org/drm/amd/-/issues/5039
> >
> >
> >       With regards,
> >
> >               Dan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-15  9:50   ` Dan Horák
@ 2026-03-16 21:02     ` Gaurav Batra
  2026-03-25 12:12       ` Ritesh Harjani
  2026-03-17 11:43     ` Ritesh Harjani
  1 sibling, 1 reply; 17+ messages in thread
From: Gaurav Batra @ 2026-03-16 21:02 UTC (permalink / raw)
  To: Dan Horák, Ritesh Harjani (IBM); +Cc: linuxppc-dev, amd-gfx, Donet Tom

[-- Attachment #1: Type: text/plain, Size: 8365 bytes --]

Hello Ritesh/Dan,


Here is the motivation for my patch and thoughts on the issue.


Before my patch, there were two scenarios to consider where, even though the
memory was pre-mapped for DMA, coherent allocations were getting mapped from
the 2GB default DMA window. When memory is pre-mapped, allocations should not
be directed towards the 2GB default DMA window.

1. An AMD GPU whose device DMA mask is > 32 bits but less than 64 bits. In
this case the PHB is put into Limited Addressability mode.

    This scenario doesn't involve vPMEM.

2. A device that supports a 64-bit DMA mask. The LPAR has vPMEM assigned.


In both of the above scenarios, the IOMMU has pre-mapped RAM from the DDW
(64-bit PPC DMA window).


Let's consider the code paths for both cases, before my patch:

1. AMD GPU

dev->dma_ops_bypass = true

dev->bus_dma_limit = 0

- Here the AMD controller shows 3 functions on the PHB.

- After the first function is probed, it sees that the memory is pre-mapped
   and doesn't direct DMA allocations towards the 2GB default window.
   So dma_go_direct() worked as expected.

- The AMD GPU driver adds device memory to system pages. The stack is as below:

add_pages+0x118/0x130 (unreliable)
pagemap_range+0x404/0x5e0
memremap_pages+0x15c/0x3d0
devm_memremap_pages+0x38/0xa0
kgd2kfd_init_zone_device+0x110/0x210 [amdgpu]
amdgpu_device_ip_init+0x648/0x6d8 [amdgpu]
amdgpu_device_init+0xb10/0x10c0 [amdgpu]
amdgpu_driver_load_kms+0x2c/0xb0 [amdgpu]
amdgpu_pci_probe+0x2e4/0x790 [amdgpu]

- This changed max_pfn to some high value beyond max RAM.

- Subsequently, for each of the other functions on the PHB, the call to
   dma_go_direct() will return false, which will then direct DMA allocations
   towards the 2GB default DMA window even though the memory is pre-mapped.

    dev->dma_ops_bypass is true, but dma_direct_get_required_mask() resulted
    in a large value for the mask (due to the changed max_pfn) which is beyond
    the AMD GPU device DMA mask.


2. A device that supports a 64-bit DMA mask. The LPAR has vPMEM assigned.

dev->dma_ops_bypass = false
dev->bus_dma_limit = some value depending on the size of RAM (e.g. 0x0800001000000000)

- Here the call to dma_go_direct() returns false since dev->dma_ops_bypass = false.



I crafted the solution to cover both cases. I tested today on an LPAR with
7.0-rc4 and it works with amdgpu.

With my patch, allocations will go direct only when dev->dma_ops_bypass = true,
which will be the case for "pre-mapped" RAM.

Ritesh mentioned that this is PowerNV. I need to revisit this patch and see
why it is failing on PowerNV. From the logs, I do see one issue: the log
indicates dev->bus_dma_limit is set to 0. This is incorrect. For pre-mapped
RAM, with my patch, bus_dma_limit should always be set to some value.

bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by *0*

Thanks,

Gaurav

On 3/15/26 4:50 AM, Dan Horák wrote:
> Hi Ritesh,
>
> On Sun, 15 Mar 2026 09:55:11 +0530
> Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
>> Dan Horák <dan@danny.cz> writes:
>>
>> +cc Gaurav,
>>
>>> Hi,
>>>
>>> starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
>>> initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
>>> with the following in the log
>>>
>>> ...
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
>>                    ^^^^
>> So looks like this is a PowerNV (Power9) machine.
> correct :-)
>   
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
>>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
>>> ...
>>>
>>> After some hints from Alex and bisecting and other investigation I have
>>> found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
>>> is the culprit and reverting it makes amdgpu load (and work) again.
>> Thanks for confirming this. Yes, this was recently added [1]
>>
>> [1]: https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/
>>
>>
>> @Gaurav,
>>
>> I am not too familiar with this area; however, looking at the logs shared
>> by Dan, it looks like we might always be taking the DMA direct allocation
>> path, and perhaps the device doesn't support the resulting address limit.
>>
>>   bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>>   bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
> a complete kernel log is at
> https://gitlab.freedesktop.org/-/project/4522/uploads/c4935bca6f37bbd06bb4045c07d00b5b/kernel.log
>
> Please let me know if you need more info.
>
>
> 		Dan
>
>   
>> Looking at the code..
>>
>> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
>> index fe7472f13b10..d5743b3c3ab3 100644
>> --- a/kernel/dma/mapping.c
>> +++ b/kernel/dma/mapping.c
>> @@ -654,7 +654,7 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
>>   	/* let the implementation decide on the zone to allocate from: */
>>   	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
>>   
>> -	if (dma_alloc_direct(dev, ops)) {
>> +	if (dma_alloc_direct(dev, ops) || arch_dma_alloc_direct(dev)) {
>>   		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
>>   	} else if (use_dma_iommu(dev)) {
>>   		cpu_addr = iommu_dma_alloc(dev, size, dma_handle, flag, attrs);
>>
>> Now, do we need arch_dma_alloc_direct() here? It always returns true if
>> dev->dma_ops_bypass is set to true, without performing the checks that
>> dma_go_direct() does.
>>
>> whereas...
>>
>> /*
>>   * Check if the devices uses a direct mapping for streaming DMA operations.
>>   * This allows IOMMU drivers to set a bypass mode if the DMA mask is large
>>   * enough.
>>   */
>> static inline bool
>> dma_alloc_direct(struct device *dev, const struct dma_map_ops *ops)
>> {
>> 	return dma_go_direct(dev, dev->coherent_dma_mask, ops);
>> }
>>
>> ...which, in dma_go_direct(), eventually reaches:
>>
>>        #ifdef CONFIG_DMA_OPS_BYPASS
>>            if (dev->dma_ops_bypass)
>>                return min_not_zero(mask, dev->bus_dma_limit) >=
>>                        dma_direct_get_required_mask(dev);
>>        #endif
>>
>> dma_alloc_direct() already checks for dma_ops_bypass and also if
>> dev->coherent_dma_mask >= dma_direct_get_required_mask(). So...
>>
>> .... Do we really need the machinery of arch_dma_{alloc|free}_direct()?
>> Aren't the checks in dma_alloc_direct() sufficient?
>>
>> Thoughts?
>>
>> -ritesh
>>
>>
>>> for the record, I have originally opened https://gitlab.freedesktop.org/drm/amd/-/issues/5039
>>>
>>>
>>> 	With regards,
>>>
>>> 		Dan

[-- Attachment #2: Type: text/html, Size: 10855 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-15  9:50   ` Dan Horák
  2026-03-16 21:02     ` Gaurav Batra
@ 2026-03-17 11:43     ` Ritesh Harjani
  2026-03-17 14:31       ` Dan Horák
  2026-03-17 22:34       ` Karl Schimanek
  1 sibling, 2 replies; 17+ messages in thread
From: Ritesh Harjani @ 2026-03-17 11:43 UTC (permalink / raw)
  To: Dan Horák; +Cc: linuxppc-dev, Gaurav Batra, amd-gfx, Donet Tom

Dan Horák <dan@danny.cz> writes:

> Hi Ritesh,
>
> On Sun, 15 Mar 2026 09:55:11 +0530
> Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
>> Dan Horák <dan@danny.cz> writes:
>> 
>> +cc Gaurav,
>> 
>> > Hi,
>> >
>> > starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
>> > initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
>> > with the following in the log
>> >
>> > ...
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
>> 
>>                   ^^^^
>> So looks like this is a PowerNV (Power9) machine.
>
> correct :-)
>  
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
>> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
>> > ...
>> >
>> > After some hints from Alex and bisecting and other investigation I have
>> > found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
>> > is the culprit and reverting it makes amdgpu load (and work) again.
>> 
>> Thanks for confirming this. Yes, this was recently added [1]
>> 
>> [1]: https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/ 
>> 
>> 
>> @Gaurav,
>> 
>> I am not too familiar with the area, however looking at the logs shared
>> by Dan, it looks like we might be always going for dma direct allocation
>> path and maybe the device doesn't support this address limit. 
>> 
>>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
>
> a complete kernel log is at
> https://gitlab.freedesktop.org/-/project/4522/uploads/c4935bca6f37bbd06bb4045c07d00b5b/kernel.log
>
> Please let me know if you need more info.

Hi Dan,

Thanks for sharing the kernel log. Could you also kindly share the full
kernel config with which you saw this issue?

I think Gaurav is still looking into the reported issue. However, I was
interested in this kernel log output:

bře 05 08:35:34 talos.danny.cz kernel: radix-mmu: Mapped 0x00002007fad00000-0x00002007fcd00000 with 64.0 KiB pages

This shows that the system is using a 64K page size, so I was interested
in knowing which kernel configs you have enabled. Donet has recently
posted 64K page size support for amdgpu [1][2] on Power. However, I
think we can still use it without Donet's changes if CONFIG_HSA_AMD_SVM
is disabled.

So, if possible, can you kindly share the kernel configs and the details
of the AMD GPU hardware attached to your Power9 bare-metal system?

[1]: https://lore.kernel.org/amd-gfx/cover.1768223974.git.donettom@linux.ibm.com/#t     #merged
[2]: https://lore.kernel.org/amd-gfx/cover.1771656655.git.donettom@linux.ibm.com/       #in-review

-ritesh



* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-17 11:43     ` Ritesh Harjani
@ 2026-03-17 14:31       ` Dan Horák
  2026-03-17 22:34       ` Karl Schimanek
  1 sibling, 0 replies; 17+ messages in thread
From: Dan Horák @ 2026-03-17 14:31 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: linuxppc-dev, Gaurav Batra, amd-gfx, Donet Tom

Hi Ritesh,

On Tue, 17 Mar 2026 17:13:31 +0530
Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:

> Dan Horák <dan@danny.cz> writes:
> 
> > Hi Ritesh,
> >
> > On Sun, 15 Mar 2026 09:55:11 +0530
> > Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
> >
> >> Dan Horák <dan@danny.cz> writes:
> >> 
> >> +cc Gaurav,
> >> 
> >> > Hi,
> >> >
> >> > starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
> >> > initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
> >> > with the following in the log
> >> >
> >> > ...
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
> >> 
> >>                   ^^^^
> >> So looks like this is a PowerNV (Power9) machine.
> >
> > correct :-)
> >  
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM memory ready
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT memory ready.
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo failed
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during GPU init
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
> >> > bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
> >> > ...
> >> >
> >> > After some hints from Alex and bisecting and other investigation I have
> >> > found that https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
> >> > is the culprit and reverting it makes amdgpu load (and work) again.
> >> 
> >> Thanks for confirming this. Yes, this was recently added [1]
> >> 
> >> [1]: https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/ 
> >> 
> >> 
> >> @Gaurav,
> >> 
> >> I am not too familiar with the area, however looking at the logs shared
> >> by Dan, it looks like we might be always going for dma direct allocation
> >> path and maybe the device doesn't support this address limit. 
> >> 
> >>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
> >>  bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
> >
> > a complete kernel log is at
> > https://gitlab.freedesktop.org/-/project/4522/uploads/c4935bca6f37bbd06bb4045c07d00b5b/kernel.log
> >
> > Please let me know if you need more info.
> 
> Hi Dan,
> 
> Thanks for sharing the kernel log. Is it also possible to kindly share
> your full kernel config with which you saw this issue.

the log is from an official Fedora kernel, thus the config is
https://src.fedoraproject.org/rpms/kernel/blob/8477f609d4875a2c20717519243fb2e6fb1cdb8f/f/kernel-ppc64le-fedora.config

and yes, Fedora, like RHEL, uses a 64k kernel page size for ppc64le,
and apart from some issues years ago I haven't had a 64k-related
problem with my card. IIRC there were page-size-related issues with the
newer (Navi?) cards, but those have also been solved.

 
> I think Gaurav, is still looking into reported issue. However I was
> interested in this kernel log output..
> 
> bře 05 08:35:34 talos.danny.cz kernel: radix-mmu: Mapped 0x00002007fad00000-0x00002007fcd00000 with 64.0 KiB pages
> 
> This shows that the system is using 64K pagesize. So I was interested in
> knowing the kernel configs you have enabled. Donet has recently posted
> 64K pagesize support with amdgpu [1][2] on Power. However, I think, we
> can still use it w/o Donet's changes if we have CONFIG_HSA_AMD_SVM
> disabled.
> 
> So, can you kindly share the kernel configs and the AMD GPU HW details
> attached to your Power9 baremetal system, if it's possible?

output of "lspci -nn -vvv"

0000:01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon Pro WX 4100] [1002:67e3] (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:0b0d]
        Device tree node: /sys/firmware/devicetree/base/pciex@600c3c0000000/pci@0/vga@0
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 194
        NUMA node: 0
        IOMMU group: 0
        Region 0: Memory at 6000000000000 (64-bit, prefetchable) [size=4G]
        Region 2: Memory at 6000100000000 (64-bit, prefetchable) [size=2M]
        Region 5: Memory at 600c000000000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at 600c000040000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, IntMsgNum 0
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- TEE-IO-
                DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- FltModeDis-
                LnkSta: Speed 8GT/s, Width x8
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR+
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS-
                         AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                         AtomicOpsCtl: ReqEn-
                         IDOReq- IDOCompl- LTR- EmergencyPowerReductionReq-
                         10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported, FltMode-
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 1000000000000000  Data: 0000
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
                        ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
                        ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
                        ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CorrIntErr- HeaderOF-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr- HeaderOF-
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn+ ECRCChkCap+ ECRCChkEn+
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [200 v1] Physical Resizable BAR
                BAR 0: current size: 4GB, supported: 256MB 512MB 1GB 2GB 4GB
        Capabilities: [270 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [2b0 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00
        Capabilities: [2c0 v1] Page Request Interface (PRI)
                PRICtl: Enable- Reset-
                PRISta: RF- UPRGI- Stopped+ PASID-
                Page Request Capacity: 00000020, Page Request Allocation: 00000000
        Capabilities: [2d0 v1] Process Address Space ID (PASID)
                PASIDCap: Exec+ Priv+, Max PASID Width: 10
                PASIDCtl: Enable- Exec- Priv-
        Capabilities: [320 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [328 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 1
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [370 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=0us PortTPowerOnTime=170us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu


> [1]: https://lore.kernel.org/amd-gfx/cover.1768223974.git.donettom@linux.ibm.com/#t     #merged
> [2]: https://lore.kernel.org/amd-gfx/cover.1771656655.git.donettom@linux.ibm.com/       #in-review
> 
> -ritesh

if anything else is needed, let me know


		Dan



* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-17 11:43     ` Ritesh Harjani
  2026-03-17 14:31       ` Dan Horák
@ 2026-03-17 22:34       ` Karl Schimanek
  1 sibling, 0 replies; 17+ messages in thread
From: Karl Schimanek @ 2026-03-17 22:34 UTC (permalink / raw)
  To: linuxppc-dev

Hi Ritesh,
Dan,

I can reproduce the issue on my system running Chimera Linux with 4K 
page size.

System details:
- Machine: RaptorCS Blackbird (POWER9, PowerNV)
- GPU: Sapphire Pulse Radeon RX 5500 XT (Navi 14)
- Kernel: 7.0-rc1

The failure is identical, ending in:

 > amdgpu ... (-12) failed to allocate kernel bo
 > amdgpu ... amdgpu_device_wb_init failed -12
 > amdgpu ... Fatal error during GPU init

I also see:

 > iommu: 64-bit OK but direct DMA is limited by 0
 > dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff

This is with 4K pages, so it does not seem to be limited to 64K page 
configurations.

Regards,
Karl




* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-15  4:25 ` Ritesh Harjani
  2026-03-15  9:50   ` Dan Horák
  2026-03-16 13:55   ` Alex Deucher
@ 2026-03-23  0:30   ` Timothy Pearson
  2 siblings, 0 replies; 17+ messages in thread
From: Timothy Pearson @ 2026-03-23  0:30 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Dan Horák, linuxppc-dev, Gaurav Batra, amd-gfx, Donet Tom



----- Original Message -----
> From: "Ritesh Harjani" <ritesh.list@gmail.com>
> To: "Dan Horák" <dan@danny.cz>, "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "Gaurav Batra" <gbatra@linux.ibm.com>
> Cc: "amd-gfx" <amd-gfx@lists.freedesktop.org>, "Donet Tom" <donettom@linux.ibm.com>
> Sent: Saturday, March 14, 2026 11:25:11 PM
> Subject: Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer

> Dan Horák <dan@danny.cz> writes:
> 
> +cc Gaurav,
> 
>> Hi,
>>
>> starting with 7.0-rc1 (meaning 6.19 is OK) the amdgpu driver fails to
>> initialize on my Linux/ppc64le Power9 based system (with Radeon Pro WX4100)
>> with the following in the log
>>
>> ...
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: GART: 256M
>> 0x000000FF00000000 - 0x000000FF0FFFFFFF
> 
>                  ^^^^
> So looks like this is a PowerNV (Power9) machine.
> 
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Detected VRAM
>> RAM=4096M, BAR=4096M
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] RAM width
>> 128bits GDDR5
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: iommu: 64-bit OK but
>> direct DMA is limited by 0
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:
>> dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  4096M of VRAM
>> memory ready
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  32570M of GTT
>> memory ready.
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to
>> allocate kernel bo
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] Debug VRAM
>> access will use slowpath MM access
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] GART: num cpu
>> pages 4096, num gpu pages 65536
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm] PCIE GART of
>> 256M enabled (table at 0x000000F4FFF80000).
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) failed to
>> allocate kernel bo
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: (-12) create WB bo
>> failed
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:
>> amdgpu_device_wb_init failed -12
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:
>> amdgpu_device_ip_init failed
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: Fatal error during
>> GPU init
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: finishing device.
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0: probe with driver
>> amdgpu failed with error -12
>> bře 05 08:35:40 talos.danny.cz kernel: amdgpu 0000:01:00.0:  ttm finalized
>> ...
>>
>> After some hints from Alex and bisecting and other investigation I have
>> found that
>> https://github.com/torvalds/linux/commit/1471c517cf7dae1a6342fb821d8ed501af956dd0
>> is the culprit and reverting it makes amdgpu load (and work) again.
> 
> Thanks for confirming this. Yes, this was recently added [1]
> 
> [1]:
> https://lore.kernel.org/linuxppc-dev/20251107161105.85999-1-gbatra@linux.ibm.com/

As this patch appears to be primarily aimed at improving performance, and has introduced a serious regression into the kernel for a large number of active users of the PowerNV platform, I would kindly ask that it be reverted until it can be reworked not to break PowerNV support.  Bear in mind that there are other devices that are 40-bit DMA limited, and they are also likely to break on Linux 7.0.

Thank you!



* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-16 21:02     ` Gaurav Batra
@ 2026-03-25 12:12       ` Ritesh Harjani
  2026-03-25 14:56         ` Gaurav Batra
  2026-03-25 16:28         ` Gaurav Batra
  0 siblings, 2 replies; 17+ messages in thread
From: Ritesh Harjani @ 2026-03-25 12:12 UTC (permalink / raw)
  To: Gaurav Batra, Dan Horák; +Cc: linuxppc-dev, amd-gfx, Donet Tom

Gaurav Batra <gbatra@linux.ibm.com> writes:

Hi Gaurav,

> Hello Ritesh/Dan,
>
>
> Here is the motivation for my patch and thoughts on the issue.
>
>
> Before my patch, there were 2 scenarios to consider where, even when the
> memory was pre-mapped for DMA, coherent allocations were getting mapped
> from the 2GB default DMA window. In the case of pre-mapped memory, the
> allocations should not be directed towards the 2GB default DMA window.
>
> 1. AMD GPU, which has a device DMA mask > 32 bits but less than 64 bits.
> In this case the PHB is put into Limited Addressability mode.
>
>     This scenario doesn't have vPMEM
>
> 2. Device that supports 64-bit DMA mask. The LPAR has vPMEM assigned.
>
>
> In both the above scenarios, IOMMU has pre-mapped RAM from DDW (64-bit 
> PPC DMA
> window).
>
>
> Let's consider the code paths for both cases, before my patch
>
> 1. AMD GPU
>
> dev->dma_ops_bypass = true
>
> dev->bus_dma_limit = 0
>
> - Here the AMD controller shows 3 functions on the PHB.
>
> - After the first function is probed, it sees that the memory is pre-mapped
>    and doesn't direct DMA allocations towards 2GB default window.
>    So, dma_go_direct() worked as expected.
>
> - AMD GPU driver, adds device memory to system pages. The stack is as below
>
> add_pages+0x118/0x130 (unreliable)
> pagemap_range+0x404/0x5e0
> memremap_pages+0x15c/0x3d0
> devm_memremap_pages+0x38/0xa0
> kgd2kfd_init_zone_device+0x110/0x210 [amdgpu]
> amdgpu_device_ip_init+0x648/0x6d8 [amdgpu]
> amdgpu_device_init+0xb10/0x10c0 [amdgpu]
> amdgpu_driver_load_kms+0x2c/0xb0 [amdgpu]
> amdgpu_pci_probe+0x2e4/0x790 [amdgpu]
>
> - This changed max_pfn to some high value beyond max RAM.
>
> - Subsequently, for each of the other functions on the PHB, the call to
>    dma_go_direct() will return false, which will then direct DMA
>    allocations towards the 2GB default DMA window even though the memory
>    is pre-mapped.
>
>     dev->dma_ops_bypass is true, but dma_direct_get_required_mask()
>     resulted in a large mask value (due to the changed max_pfn) which is
>     beyond the AMD GPU device DMA mask.
>
>
> 2. Device supports 64-bit DMA mask. The LPAR has vPMEM assigned
>
> dev->dma_ops_bypass = false
> dev->bus_dma_limit = some value depending on the size of RAM (e.g.
> 0x0800001000000000)
>
> - Here the call to dma_go_direct() returns false since 
> dev->dma_ops_bypass = false.
>
>
>
> I crafted the solution to cover both cases. I tested today on an LPAR
> with 7.0-rc4 and it works with AMDGPU.
>
> With my patch, allocations will go towards direct only when 
> dev->dma_ops_bypass = true,
> which will be the case for "pre-mapped" RAM.
>
> Ritesh mentioned that this is PowerNV. I need to revisit this patch and 
> see why it is failing on PowerNV.
> ...
> From the logs, I do see an issue: they indicate that dev->bus_dma_limit
> is set to 0. This is incorrect. For pre-mapped RAM, with my patch,
> bus_dma_limit should always be set to some value.
>

In that case, do you think adding an extra check for dev->bus_dma_limit
would help? I am sure you have already thought of this and are probably
still working on the correct fix.

+bool arch_dma_alloc_direct(struct device *dev)
+{
+	if (dev->dma_ops_bypass && dev->bus_dma_limit)
+		return true;
+
+	return false;
+}
+
+bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
+{
+	if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
+		return false;
+
+	return is_direct_handle(dev, dma_handle);
+}

<snip from Timothy>

> introduced a serious regression into the kernel for a large number of
> active users of the PowerNV platform, I would kindly ask that it be
> reverted until it can be reworked not to break PowerNV support.  Bear
> in mind there are other devices that are 40 bit DMA limited, and they
> are also likely to break on Linux 7.0.

Looks like more people are facing an issue with this now.

-ritesh



* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-25 12:12       ` Ritesh Harjani
@ 2026-03-25 14:56         ` Gaurav Batra
  2026-03-25 16:28         ` Gaurav Batra
  1 sibling, 0 replies; 17+ messages in thread
From: Gaurav Batra @ 2026-03-25 14:56 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), Dan Horák; +Cc: linuxppc-dev, amd-gfx, Donet Tom

Hello Ritesh,

This fix needs to be re-worked. My design was to have both 
"dma_ops_bypass" and bus_dma_limit set in case of pre-mapped TCEs.

There were a lot of things to consider; a major one was "max_pfn". This
is not simply a reflection of the maximum RAM: max_pfn changes when GPU
memory is added, or when persistent memory is converted to RAM by a
DLPAR event. This can leave max_pfn at a much higher value than the
actual top of RAM.

This throws off dma_direct_get_required_mask().

It will be better to revert this patch and re-work.

Thanks,

Gaurav

On 3/25/26 7:12 AM, Ritesh Harjani (IBM) wrote:
> Gaurav Batra <gbatra@linux.ibm.com> writes:
>
> Hi Gaurav,
>
>> Hello Ritesh/Dan,
>>
>>
>> Here is the motivation for my patch and thoughts on the issue.
>>
>>
>> Before my patch, there were 2 scenarios to consider where, even when the
>> memory
>> was pre-mapped for DMA, coherent allocations were getting mapped from 2GB
>> default DMA Window. In case of pre-mapped memory, the allocations should
>> not be
>> directed towards 2GB default DMA window.
>>
>> 1. AMD GPU which has device DMA mask > 32 bits but less then 64 bits. In
>> this
>> case the PHB is put into Limited Addressability mode.
>>
>>      This scenario doesn't have vPMEM
>>
>> 2. Device that supports 64-bit DMA mask. The LPAR has vPMEM assigned.
>>
>>
>> In both the above scenarios, IOMMU has pre-mapped RAM from DDW (64-bit
>> PPC DMA
>> window).
>>
>>
>> Lets consider code paths for both the case, before my patch
>>
>> 1. AMD GPU
>>
>> dev->dma_ops_bypass = true
>>
>> dev->bus_dma_limit = 0
>>
>> - Here the AMD controller shows 3 functions on the PHB.
>>
>> - After the first function is probed, it sees that the memory is pre-mapped
>>     and doesn't direct DMA allocations towards 2GB default window.
>>     So, dma_go_direct() worked as expected.
>>
>> - AMD GPU driver, adds device memory to system pages. The stack is as below
>>
>> add_pages+0x118/0x130 (unreliable)
>> pagemap_range+0x404/0x5e0
>> memremap_pages+0x15c/0x3d0
>> devm_memremap_pages+0x38/0xa0
>> kgd2kfd_init_zone_device+0x110/0x210 [amdgpu]
>> amdgpu_device_ip_init+0x648/0x6d8 [amdgpu]
>> amdgpu_device_init+0xb10/0x10c0 [amdgpu]
>> amdgpu_driver_load_kms+0x2c/0xb0 [amdgpu]
>> amdgpu_pci_probe+0x2e4/0x790 [amdgpu]
>>
>> - This changed max_pfn to some high value beyond max RAM.
>>
>> - Subsequently, for each other functions on the PHB, the call to
>>     dma_go_direct() will return false which will then direct DMA
>> allocations towards
>>     2GB Default DMA window even if the memory is pre-mapped.
>>
>>      dev->dma_ops_bypass is true, dma_direct_get_required_mask() resulted
>> in large
>>      value for the mask (due to changed max_pfn) which is beyond AMD GPU
>> device DMA mask
>>
>>
>> 2. Device supports 64-bit DMA mask. The LPAR has vPMEM assigned
>>
>> dev->dma_ops_bypass = false
>> dev->bus_dma_limit = has some value depending on size of RAM (eg.
>> 0x0800001000000000)
>>
>> - Here the call to dma_go_direct() returns false since
>> dev->dma_ops_bypass = false.
>>
>>
>>
>> I crafted the solution to cover both the case. I tested today on an LPAR
>> with 7.0-rc4 and it works with AMDGPU.
>>
>> With my patch, allocations will go towards direct only when
>> dev->dma_ops_bypass = true,
>> which will be the case for "pre-mapped" RAM.
>>
>> Ritesh mentioned that this is PowerNV. I need to revisit this patch and
>> see why it is failing on PowerNV.
>> ...
>>  From the logs, I do see some issue. The log indicates
>> dev->bus_dma_limit is set to 0. This is incorrect. For pre-mapped RAM,
>> with my
>> patch, bus_dma_limit should always be set to some value.
>>
> In that case, do you think adding an extra check for dev->bus_dma_limit
> would help? I am sure you already would have thought of this and
> probably are still working to find the correct fix?
>
> +bool arch_dma_alloc_direct(struct device *dev)
> +{
> +	if (dev->dma_ops_bypass && dev->bus_dma_limit)
> +		return true;
> +
> +	return false;
> +}
> +
> +bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
> +{
> +	if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
> +		return false;
> +
> +	return is_direct_handle(dev, dma_handle);
> +}
>
> <snip from Timothy>
>
>> introduced a serious regression into the kernel for a large number of
>> active users of the PowerNV platform, I would kindly ask that it be
>> reverted until it can be reworked not to break PowerNV support.  Bear
>> in mind there are other devices that are 40 bit DMA limited, and they
>> are also likely to break on Linux 7.0.
> Looks like more people are facing an issue with this now.
>
> -ritesh



* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-25 12:12       ` Ritesh Harjani
  2026-03-25 14:56         ` Gaurav Batra
@ 2026-03-25 16:28         ` Gaurav Batra
  2026-03-25 17:42           ` Ritesh Harjani
  1 sibling, 1 reply; 17+ messages in thread
From: Gaurav Batra @ 2026-03-25 16:28 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), Dan Horák; +Cc: linuxppc-dev, amd-gfx, Donet Tom


Hello Ritesh

I think what you are proposing, adding dev->bus_dma_limit to the check,
might work. In the case of PowerNV this is not set, but
dev->dma_ops_bypass is set, so PowerNV will fall back to how it was
before.

Also, since both of these are set in LPAR mode, the current patch will
keep working there as-is.

Dan, can you please try Ritesh's proposed fix on your PowerNV box? I
have not been able to get my hands on one yet.

Thanks,

Gaurav

On 3/25/26 7:12 AM, Ritesh Harjani (IBM) wrote:
> Gaurav Batra<gbatra@linux.ibm.com> writes:
>
> Hi Gaurav,
>
>> Hello Ritesh/Dan,
>>
>>
>> Here is the motivation for my patch and thoughts on the issue.
>>
>>
>> Before my patch, there were 2 scenarios to consider where, even when the
>> memory
>> was pre-mapped for DMA, coherent allocations were getting mapped from 2GB
>> default DMA Window. In case of pre-mapped memory, the allocations should
>> not be
>> directed towards 2GB default DMA window.
>>
>> 1. AMD GPU which has device DMA mask > 32 bits but less then 64 bits. In
>> this
>> case the PHB is put into Limited Addressability mode.
>>
>>      This scenario doesn't have vPMEM
>>
>> 2. Device that supports 64-bit DMA mask. The LPAR has vPMEM assigned.
>>
>>
>> In both the above scenarios, IOMMU has pre-mapped RAM from DDW (64-bit
>> PPC DMA
>> window).
>>
>>
>> Lets consider code paths for both the case, before my patch
>>
>> 1. AMD GPU
>>
>> dev->dma_ops_bypass = true
>>
>> dev->bus_dma_limit = 0
>>
>> - Here the AMD controller shows 3 functions on the PHB.
>>
>> - After the first function is probed, it sees that the memory is pre-mapped
>>     and doesn't direct DMA allocations towards 2GB default window.
>>     So, dma_go_direct() worked as expected.
>>
>> - AMD GPU driver, adds device memory to system pages. The stack is as below
>>
>> add_pages+0x118/0x130 (unreliable)
>> pagemap_range+0x404/0x5e0
>> memremap_pages+0x15c/0x3d0
>> devm_memremap_pages+0x38/0xa0
>> kgd2kfd_init_zone_device+0x110/0x210 [amdgpu]
>> amdgpu_device_ip_init+0x648/0x6d8 [amdgpu]
>> amdgpu_device_init+0xb10/0x10c0 [amdgpu]
>> amdgpu_driver_load_kms+0x2c/0xb0 [amdgpu]
>> amdgpu_pci_probe+0x2e4/0x790 [amdgpu]
>>
>> - This changed max_pfn to some high value beyond max RAM.
>>
>> - Subsequently, for each other functions on the PHB, the call to
>>     dma_go_direct() will return false which will then direct DMA
>> allocations towards
>>     2GB Default DMA window even if the memory is pre-mapped.
>>
>>      dev->dma_ops_bypass is true, dma_direct_get_required_mask() resulted
>> in large
>>      value for the mask (due to changed max_pfn) which is beyond AMD GPU
>> device DMA mask
>>
>>
>> 2. Device supports 64-bit DMA mask. The LPAR has vPMEM assigned
>>
>> dev->dma_ops_bypass = false
>> dev->bus_dma_limit = some value depending on the size of RAM (e.g.
>> 0x0800001000000000)
>>
>> - Here the call to dma_go_direct() returns false since
>> dev->dma_ops_bypass = false.
>>
>>
>>
>> I crafted the solution to cover both cases. I tested it today on an LPAR
>> with 7.0-rc4 and it works with amdgpu.
>>
>> With my patch, allocations will go direct only when
>> dev->dma_ops_bypass = true, which will be the case for "pre-mapped" RAM.
>>
>> Ritesh mentioned that this is PowerNV. I need to revisit this patch and
>> see why it is failing on PowerNV.
>> ...
>>  From the logs, I do see an issue. The log indicates that
>> dev->bus_dma_limit is set to 0. This is incorrect. For pre-mapped RAM,
>> with my patch, bus_dma_limit should always be set to some value.
>>
> In that case, do you think adding an extra check for dev->bus_dma_limit
> would help? I am sure you would already have thought of this and are
> probably still working to find the correct fix.
>
> +bool arch_dma_alloc_direct(struct device *dev)
> +{
> +	if (dev->dma_ops_bypass && dev->bus_dma_limit)
> +		return true;
> +
> +	return false;
> +}
> +
> +bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
> +{
> +	if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
> +		return false;
> +
> +	return is_direct_handle(dev, dma_handle);
> +}
>
> <snip from Timothy>
>
>> introduced a serious regression into the kernel for a large number of
>> active users of the PowerNV platform, I would kindly ask that it be
>> reverted until it can be reworked not to break PowerNV support. Bear
>> in mind there are other devices that are 40-bit DMA limited, and they
>> are also likely to break on Linux 7.0.
> Looks like more people are facing an issue with this now.
>
> -ritesh

[-- Attachment #2: Type: text/html, Size: 5116 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-25 16:28         ` Gaurav Batra
@ 2026-03-25 17:42           ` Ritesh Harjani
  2026-03-25 20:00             ` Dan Horák
  2026-03-26 10:29             ` Dan Horák
  0 siblings, 2 replies; 17+ messages in thread
From: Ritesh Harjani @ 2026-03-25 17:42 UTC (permalink / raw)
  To: Gaurav Batra, Dan Horák; +Cc: linuxppc-dev, amd-gfx, Donet Tom

Gaurav Batra <gbatra@linux.ibm.com> writes:

> Hello Ritesh
>
> I think what you are proposing, adding dev->bus_dma_limit to the check,
> might work. In the case of PowerNV, this is not set, but
> dev->dma_ops_bypass is set. So, for PowerNV, it will fall back to how it
> was before.
>
> Also, since both of these are set in LPAR mode, the current patch as-is
> will work.
>
> Dan, can you please try Ritesh's proposed fix on your PowerNV box? I am
> not able to lay my hands on a PowerNV box yet.
>

It would be this diff then. Note, I have only compile-tested it.

diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index 73e10bd4d56d..8b4de508d2eb 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -67,7 +67,7 @@ bool arch_dma_unmap_sg_direct(struct device *dev, struct scatterlist *sg,
 }
 bool arch_dma_alloc_direct(struct device *dev)
 {
-       if (dev->dma_ops_bypass)
+       if (dev->dma_ops_bypass && dev->bus_dma_limit)
                return true;

        return false;
@@ -75,7 +75,7 @@ bool arch_dma_alloc_direct(struct device *dev)

 bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
 {
-       if (!dev->dma_ops_bypass)
+       if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
                return false;

        return is_direct_handle(dev, dma_handle);


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-25 17:42           ` Ritesh Harjani
@ 2026-03-25 20:00             ` Dan Horák
  2026-03-26 10:29             ` Dan Horák
  1 sibling, 0 replies; 17+ messages in thread
From: Dan Horák @ 2026-03-25 20:00 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: Gaurav Batra, linuxppc-dev, amd-gfx, Donet Tom

On Wed, 25 Mar 2026 23:12:16 +0530
Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:

> Gaurav Batra <gbatra@linux.ibm.com> writes:
> 
> > Hello Ritesh
> >
> > I think what you are proposing, adding dev->bus_dma_limit to the check,
> > might work. In the case of PowerNV, this is not set, but
> > dev->dma_ops_bypass is set. So, for PowerNV, it will fall back to how it
> > was before.
> >
> > Also, since both of these are set in LPAR mode, the current patch as-is
> > will work.
> >
> > Dan, can you please try Ritesh's proposed fix on your PowerNV box? I am
> > not able to lay my hands on a PowerNV box yet.
> >
> 
> It would be this diff then. Note, I have only compile tested it.
> 
> diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
> index 73e10bd4d56d..8b4de508d2eb 100644
> --- a/arch/powerpc/kernel/dma-iommu.c
> +++ b/arch/powerpc/kernel/dma-iommu.c
> @@ -67,7 +67,7 @@ bool arch_dma_unmap_sg_direct(struct device *dev, struct scatterlist *sg,
>  }
>  bool arch_dma_alloc_direct(struct device *dev)
>  {
> -       if (dev->dma_ops_bypass)
> +       if (dev->dma_ops_bypass && dev->bus_dma_limit)
>                 return true;
> 
>         return false;
> @@ -75,7 +75,7 @@ bool arch_dma_alloc_direct(struct device *dev)
> 
>  bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
>  {
> -       if (!dev->dma_ops_bypass)
> +       if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
>                 return false;
> 
>         return is_direct_handle(dev, dma_handle);

yup, I will give it a try in the morning and report back


		Dan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-25 17:42           ` Ritesh Harjani
  2026-03-25 20:00             ` Dan Horák
@ 2026-03-26 10:29             ` Dan Horák
  2026-03-26 10:38               ` Ritesh Harjani
  1 sibling, 1 reply; 17+ messages in thread
From: Dan Horák @ 2026-03-26 10:29 UTC (permalink / raw)
  To: Ritesh Harjani; +Cc: Gaurav Batra, linuxppc-dev, amd-gfx, Donet Tom

Hi Ritesh,

On Wed, 25 Mar 2026 23:12:16 +0530
Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:

> Gaurav Batra <gbatra@linux.ibm.com> writes:
> 
> > Hello Ritesh
> >
> > I think what you are proposing, adding dev->bus_dma_limit to the check,
> > might work. In the case of PowerNV, this is not set, but
> > dev->dma_ops_bypass is set. So, for PowerNV, it will fall back to how it
> > was before.
> >
> > Also, since both of these are set in LPAR mode, the current patch as-is
> > will work.
> >
> > Dan, can you please try Ritesh's proposed fix on your PowerNV box? I am
> > not able to lay my hands on a PowerNV box yet.
> >
> 
> It would be this diff then. Note, I have only compile tested it.
> 
> diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
> index 73e10bd4d56d..8b4de508d2eb 100644
> --- a/arch/powerpc/kernel/dma-iommu.c
> +++ b/arch/powerpc/kernel/dma-iommu.c
> @@ -67,7 +67,7 @@ bool arch_dma_unmap_sg_direct(struct device *dev, struct scatterlist *sg,
>  }
>  bool arch_dma_alloc_direct(struct device *dev)
>  {
> -       if (dev->dma_ops_bypass)
> +       if (dev->dma_ops_bypass && dev->bus_dma_limit)
>                 return true;
> 
>         return false;
> @@ -75,7 +75,7 @@ bool arch_dma_alloc_direct(struct device *dev)
> 
>  bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
>  {
> -       if (!dev->dma_ops_bypass)
> +       if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
>                 return false;
> 
>         return is_direct_handle(dev, dma_handle);

this seems to fix the amdgpu initialization, full kernel log available
as https://fedora.danny.cz/tmp/kernel-7.0-rc5.log

Tested-by: Dan Horák <dan@danny.cz>


		Dan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-26 10:29             ` Dan Horák
@ 2026-03-26 10:38               ` Ritesh Harjani
  2026-03-26 13:37                 ` Gaurav Batra
  0 siblings, 1 reply; 17+ messages in thread
From: Ritesh Harjani @ 2026-03-26 10:38 UTC (permalink / raw)
  To: Dan Horák, Gaurav Batra; +Cc: linuxppc-dev, amd-gfx, Donet Tom

Dan Horák <dan@danny.cz> writes:

> Hi Ritesh,
>
> On Wed, 25 Mar 2026 23:12:16 +0530
> Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
>> Gaurav Batra <gbatra@linux.ibm.com> writes:
>> 
>> > Hello Ritesh
>> >
>> > I think what you are proposing, adding dev->bus_dma_limit to the check,
>> > might work. In the case of PowerNV, this is not set, but
>> > dev->dma_ops_bypass is set. So, for PowerNV, it will fall back to how it
>> > was before.
>> >
>> > Also, since both of these are set in LPAR mode, the current patch as-is
>> > will work.
>> >
>> > Dan, can you please try Ritesh's proposed fix on your PowerNV box? I am
>> > not able to lay my hands on a PowerNV box yet.
>> >
>> 
>> It would be this diff then. Note, I have only compile tested it.
>> 
>> diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
>> index 73e10bd4d56d..8b4de508d2eb 100644
>> --- a/arch/powerpc/kernel/dma-iommu.c
>> +++ b/arch/powerpc/kernel/dma-iommu.c
>> @@ -67,7 +67,7 @@ bool arch_dma_unmap_sg_direct(struct device *dev, struct scatterlist *sg,
>>  }
>>  bool arch_dma_alloc_direct(struct device *dev)
>>  {
>> -       if (dev->dma_ops_bypass)
>> +       if (dev->dma_ops_bypass && dev->bus_dma_limit)
>>                 return true;
>> 
>>         return false;
>> @@ -75,7 +75,7 @@ bool arch_dma_alloc_direct(struct device *dev)
>> 
>>  bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
>>  {
>> -       if (!dev->dma_ops_bypass)
>> +       if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
>>                 return false;
>> 
>>         return is_direct_handle(dev, dma_handle);
>
> this seems to fix the amdgpu initialization, full kernel log available
> as https://fedora.danny.cz/tmp/kernel-7.0-rc5.log
>
> Tested-by: Dan Horák <dan@danny.cz>
>

Thanks a lot Dan!

@Gaurav,
In that case, please feel free to take the diff and submit an official
patch (if you think this looks good for all cases). You might want to
test your previous use case once, so that we don't see any new surprises
there :)

-ritesh


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer
  2026-03-26 10:38               ` Ritesh Harjani
@ 2026-03-26 13:37                 ` Gaurav Batra
  0 siblings, 0 replies; 17+ messages in thread
From: Gaurav Batra @ 2026-03-26 13:37 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), Dan Horák; +Cc: linuxppc-dev, amd-gfx, Donet Tom

Thanks a lot Dan for testing Ritesh's patch.

@Ritesh, I need to test the two scenarios for which I sent the first patch.

Thanks,

Gaurav

On 3/26/26 5:38 AM, Ritesh Harjani (IBM) wrote:
> Dan Horák <dan@danny.cz> writes:
>
>> Hi Ritesh,
>>
>> On Wed, 25 Mar 2026 23:12:16 +0530
>> Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>>
>>> Gaurav Batra <gbatra@linux.ibm.com> writes:
>>>
>>>> Hello Ritesh
>>>>
>>>> I think what you are proposing, adding dev->bus_dma_limit to the check,
>>>> might work. In the case of PowerNV, this is not set, but
>>>> dev->dma_ops_bypass is set. So, for PowerNV, it will fall back to how it
>>>> was before.
>>>>
>>>> Also, since both of these are set in LPAR mode, the current patch as-is
>>>> will work.
>>>>
>>>> Dan, can you please try Ritesh's proposed fix on your PowerNV box? I am
>>>> not able to lay my hands on a PowerNV box yet.
>>>>
>>> It would be this diff then. Note, I have only compile tested it.
>>>
>>> diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
>>> index 73e10bd4d56d..8b4de508d2eb 100644
>>> --- a/arch/powerpc/kernel/dma-iommu.c
>>> +++ b/arch/powerpc/kernel/dma-iommu.c
>>> @@ -67,7 +67,7 @@ bool arch_dma_unmap_sg_direct(struct device *dev, struct scatterlist *sg,
>>>   }
>>>   bool arch_dma_alloc_direct(struct device *dev)
>>>   {
>>> -       if (dev->dma_ops_bypass)
>>> +       if (dev->dma_ops_bypass && dev->bus_dma_limit)
>>>                  return true;
>>>
>>>          return false;
>>> @@ -75,7 +75,7 @@ bool arch_dma_alloc_direct(struct device *dev)
>>>
>>>   bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
>>>   {
>>> -       if (!dev->dma_ops_bypass)
>>> +       if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
>>>                  return false;
>>>
>>>          return is_direct_handle(dev, dma_handle);
>> this seems to fix the amdgpu initialization, full kernel log available
>> as https://fedora.danny.cz/tmp/kernel-7.0-rc5.log
>>
>> Tested-by: Dan Horák <dan@danny.cz>
>>
> Thanks a lot Dan!
>
> @Gaurav,
> In that case, please feel free to take the diff and submit an official
> patch (if you think this looks good for all cases). You might want to
> test your previous use case once, so that we don't see any new surprises
> there :)
>
> -ritesh


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2026-03-26 13:37 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-13 13:23 amdgpu driver fails to initialize on ppc64le in 7.0-rc1 and newer Dan Horák
2026-03-15  4:25 ` Ritesh Harjani
2026-03-15  9:50   ` Dan Horák
2026-03-16 21:02     ` Gaurav Batra
2026-03-25 12:12       ` Ritesh Harjani
2026-03-25 14:56         ` Gaurav Batra
2026-03-25 16:28         ` Gaurav Batra
2026-03-25 17:42           ` Ritesh Harjani
2026-03-25 20:00             ` Dan Horák
2026-03-26 10:29             ` Dan Horák
2026-03-26 10:38               ` Ritesh Harjani
2026-03-26 13:37                 ` Gaurav Batra
2026-03-17 11:43     ` Ritesh Harjani
2026-03-17 14:31       ` Dan Horák
2026-03-17 22:34       ` Karl Schimanek
2026-03-16 13:55   ` Alex Deucher
2026-03-23  0:30   ` Timothy Pearson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox