Hello Ritesh,

I think what you are proposing, adding dev->bus_dma_limit to the check,
might work. In the case of PowerNV, bus_dma_limit is not set, but
dev->dma_ops_bypass is set. So, for PowerNV, it will fall back to how it
was before. Also, since both of these are set in LPAR mode, the current
patch as-is will work.

Dan, can you please try Ritesh's proposed fix on your PowerNV box? I am
not able to lay my hands on a PowerNV box yet.

Thanks,

Gaurav

On 3/25/26 7:12 AM, Ritesh Harjani (IBM) wrote:
> Gaurav Batra writes:
>
> Hi Gaurav,
>
>> Hello Ritesh/Dan,
>>
>> Here is the motivation for my patch and my thoughts on the issue.
>>
>> Before my patch, there were two scenarios to consider where, even when
>> the memory was pre-mapped for DMA, coherent allocations were getting
>> mapped from the 2GB default DMA window. When memory is pre-mapped, the
>> allocations should not be directed towards the 2GB default DMA window.
>>
>> 1. AMD GPU, which has a device DMA mask > 32 bits but less than 64
>>    bits. In this case the PHB is put into Limited Addressability mode.
>>
>>    This scenario doesn't have vPMEM.
>>
>> 2. Device that supports a 64-bit DMA mask. The LPAR has vPMEM assigned.
>>
>> In both of the above scenarios, the IOMMU has pre-mapped RAM from the
>> DDW (64-bit PPC DMA window).
>>
>> Let's consider the code paths for both cases, before my patch:
>>
>> 1. AMD GPU
>>
>>    dev->dma_ops_bypass = true
>>    dev->bus_dma_limit = 0
>>
>> - Here the AMD controller shows 3 functions on the PHB.
>>
>> - After the first function is probed, it sees that the memory is
>>   pre-mapped and doesn't direct DMA allocations towards the 2GB default
>>   window. So, dma_go_direct() worked as expected.
>>
>> - The AMD GPU driver adds device memory to system pages.
>>   The stack is as below:
>>
>>   add_pages+0x118/0x130 (unreliable)
>>   pagemap_range+0x404/0x5e0
>>   memremap_pages+0x15c/0x3d0
>>   devm_memremap_pages+0x38/0xa0
>>   kgd2kfd_init_zone_device+0x110/0x210 [amdgpu]
>>   amdgpu_device_ip_init+0x648/0x6d8 [amdgpu]
>>   amdgpu_device_init+0xb10/0x10c0 [amdgpu]
>>   amdgpu_driver_load_kms+0x2c/0xb0 [amdgpu]
>>   amdgpu_pci_probe+0x2e4/0x790 [amdgpu]
>>
>> - This changed max_pfn to some high value beyond max RAM.
>>
>> - Subsequently, for each of the other functions on the PHB, the call to
>>   dma_go_direct() returns false, which then directs DMA allocations
>>   towards the 2GB default DMA window even though the memory is
>>   pre-mapped.
>>
>>   dev->dma_ops_bypass is true, but dma_direct_get_required_mask()
>>   resulted in a large value for the mask (due to the changed max_pfn),
>>   which is beyond the AMD GPU device DMA mask.
>>
>> 2. Device supports a 64-bit DMA mask. The LPAR has vPMEM assigned.
>>
>>    dev->dma_ops_bypass = false
>>    dev->bus_dma_limit = some value depending on the size of RAM
>>    (e.g. 0x0800001000000000)
>>
>> - Here the call to dma_go_direct() returns false since
>>   dev->dma_ops_bypass = false.
>>
>> I crafted the solution to cover both cases. I tested today on an LPAR
>> with 7.0-rc4 and it works with AMDGPU.
>>
>> With my patch, allocations go direct only when dev->dma_ops_bypass =
>> true, which will be the case for pre-mapped RAM.
>>
>> Ritesh mentioned that this is PowerNV. I need to revisit this patch and
>> see why it is failing on PowerNV.
>> ...
>> From the logs, I do see some issue. The log indicates
>> dev->bus_dma_limit is set to 0. This is incorrect. For pre-mapped RAM,
>> with my patch, bus_dma_limit should always be set to some value.
>>
> In that case, do you think adding an extra check for dev->bus_dma_limit
> would help? I am sure you already would have thought of this and
> probably are still working to find the correct fix?
>
> +bool arch_dma_alloc_direct(struct device *dev)
> +{
> +	if (dev->dma_ops_bypass && dev->bus_dma_limit)
> +		return true;
> +
> +	return false;
> +}
> +
> +bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
> +{
> +	if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
> +		return false;
> +
> +	return is_direct_handle(dev, dma_handle);
> +}
>
>> introduced a serious regression into the kernel for a large number of
>> active users of the PowerNV platform, I would kindly ask that it be
>> reverted until it can be reworked not to break PowerNV support. Bear
>> in mind there are other devices that are 40-bit DMA limited, and they
>> are also likely to break on Linux 7.0.
>
> Looks like more people are facing an issue with this now.
>
> -ritesh