public inbox for linux-arm-kernel@lists.infradead.org
* [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
@ 2026-03-31  7:50 Midgy BALON
  2026-03-31  7:57 ` Shawn Lin
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Midgy BALON @ 2026-03-31  7:50 UTC (permalink / raw)
  To: iommu
  Cc: joro, will, robin.murphy, heiko, jonas, linux-arm-kernel,
	linux-rockchip, linux-kernel, stable, Midgy BALON

commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
available memory for IOMMU v2") removed GFP_DMA32 from
iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
supports up to 40-bit physical addresses for page tables.  However, the
RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
physical addresses above 4 GB regardless of the address encoding range.

On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
GFP_DMA32 causes two distinct failure modes:

1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
   memory above 0x100000000.  The hardware page-table walker issues a
   bus error trying to dereference those addresses, causing an IOMMU
   fault on the first DMA transaction.

2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
   above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
   then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
   returns phys_to_virt() of the bounce buffer address; PTEs are written
   there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
   original (zero) data back over the bounce buffer, silently erasing the
   freshly written PTEs.  The IOMMU faults because every PTE reads as zero.

Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
currently only serves "rockchip,rk3568-iommu" in mainline.

Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
  - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
  - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
  - No IOMMU faults, correct inference results

Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
Cc: stable@vger.kernel.org
Cc: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
 drivers/iommu/rockchip-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 85f3667e797..8b45db29471 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
 	.pt_address = &rk_dte_pt_address_v2,
 	.mk_dtentries = &rk_mk_dte_v2,
 	.mk_ptentries = &rk_mk_pte_v2,
-	.dma_bit_mask = DMA_BIT_MASK(40),
-	.gfp_flags = 0,
+	.dma_bit_mask = DMA_BIT_MASK(32),
+	.gfp_flags = GFP_DMA32,
 };
 
 static const struct of_device_id rk_iommu_dt_ids[] = {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-03-31  7:50 [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU Midgy BALON
@ 2026-03-31  7:57 ` Shawn Lin
  2026-03-31 18:13 ` Jonas Karlman
  2026-04-01  7:48 ` Simon
  2 siblings, 0 replies; 8+ messages in thread
From: Shawn Lin @ 2026-03-31  7:57 UTC (permalink / raw)
  To: Midgy BALON
  Cc: shawn.lin, joro, will, robin.murphy, heiko, jonas,
	linux-arm-kernel, linux-rockchip, linux-kernel, stable, iommu,
	Simon Xue

+ Simon

On Tuesday, 2026/03/31 15:50, Midgy BALON wrote:
> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> available memory for IOMMU v2") removed GFP_DMA32 from
> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
> supports up to 40-bit physical addresses for page tables.  However, the
> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
> physical addresses above 4 GB regardless of the address encoding range.
> 
> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
> GFP_DMA32 causes two distinct failure modes:
> 
> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>     memory above 0x100000000.  The hardware page-table walker issues a
>     bus error trying to dereference those addresses, causing an IOMMU
>     fault on the first DMA transaction.
> 
> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>     above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>     then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>     returns phys_to_virt() of the bounce buffer address; PTEs are written
>     there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>     original (zero) data back over the bounce buffer, silently erasing the
>     freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
> 
> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> currently only serves "rockchip,rk3568-iommu" in mainline.
> 
> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>    - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>    - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>    - No IOMMU faults, correct inference results
> 
> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
> Cc: stable@vger.kernel.org
> Cc: Jonas Karlman <jonas@kwiboo.se>
> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> ---
>   drivers/iommu/rockchip-iommu.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 85f3667e797..8b45db29471 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>   	.pt_address = &rk_dte_pt_address_v2,
>   	.mk_dtentries = &rk_mk_dte_v2,
>   	.mk_ptentries = &rk_mk_pte_v2,
> -	.dma_bit_mask = DMA_BIT_MASK(40),
> -	.gfp_flags = 0,
> +	.dma_bit_mask = DMA_BIT_MASK(32),
> +	.gfp_flags = GFP_DMA32,
>   };
>   
>   static const struct of_device_id rk_iommu_dt_ids[] = {



* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-03-31  7:50 [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU Midgy BALON
  2026-03-31  7:57 ` Shawn Lin
@ 2026-03-31 18:13 ` Jonas Karlman
  2026-04-01  7:48 ` Simon
  2 siblings, 0 replies; 8+ messages in thread
From: Jonas Karlman @ 2026-03-31 18:13 UTC (permalink / raw)
  To: Midgy BALON
  Cc: Shawn Lin, Simon Xue, iommu, joro, will, robin.murphy, heiko,
	linux-arm-kernel, linux-rockchip, linux-kernel, stable

Hi Midgy,

On 3/31/2026 9:50 AM, Midgy BALON wrote:
> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> available memory for IOMMU v2") removed GFP_DMA32 from
> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
> supports up to 40-bit physical addresses for page tables.  However, the
> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
> physical addresses above 4 GB regardless of the address encoding range.
> 
> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
> GFP_DMA32 causes two distinct failure modes:
> 
> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>    memory above 0x100000000.  The hardware page-table walker issues a
>    bus error trying to dereference those addresses, causing an IOMMU
>    fault on the first DMA transaction.
> 
> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>    above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>    then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>    returns phys_to_virt() of the bounce buffer address; PTEs are written
>    there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>    original (zero) data back over the bounce buffer, silently erasing the
>    freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
> 
> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> currently only serves "rockchip,rk3568-iommu" in mainline.
> 
> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>   - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>   - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>   - No IOMMU faults, correct inference results
> 
> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
> Cc: stable@vger.kernel.org
> Cc: Jonas Karlman <jonas@kwiboo.se>
> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> ---
>  drivers/iommu/rockchip-iommu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 85f3667e797..8b45db29471 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>  	.pt_address = &rk_dte_pt_address_v2,
>  	.mk_dtentries = &rk_mk_dte_v2,
>  	.mk_ptentries = &rk_mk_pte_v2,
> -	.dma_bit_mask = DMA_BIT_MASK(40),
> -	.gfp_flags = 0,
> +	.dma_bit_mask = DMA_BIT_MASK(32),
> +	.gfp_flags = GFP_DMA32,

This change is wrong because this struct describes the RK IOMMU v2, which
is capable of 40-bit addressing and is used by e.g. the RK3568 VOP2 MMU
and MMUs in other RK35xx SoCs.

What you have discovered is most likely that some IP blocks, e.g. the NPU
on RK3568, are not capable of >32-bit addressing, and/or that such IP
blocks are still using IOMMU v1 blocks, or some variant with a 32-bit
limitation.

However, the RK IOMMU driver is currently not capable of supporting
different IOMMU revisions. If I recall correctly, a patch trying to
address that may already have been posted on the ML.

Have you seen this issue with a variant of the rocket driver that adds
support for RK3568, or with a variant of the downstream rknpu driver
forward-ported to mainline?

If your findings are correct, it is likely that the NPU MMU needs to use
a different compatible, since rockchip,rk3568-iommu describes the IOMMUv2
that is capable of 40-bit addressing and is also used by other RK35xx
SoCs.

Regards,
Jonas

>  };
>  
>  static const struct of_device_id rk_iommu_dt_ids[] = {




* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-03-31  7:50 [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU Midgy BALON
  2026-03-31  7:57 ` Shawn Lin
  2026-03-31 18:13 ` Jonas Karlman
@ 2026-04-01  7:48 ` Simon
  2026-04-01  8:41   ` Jonas Karlman
  2 siblings, 1 reply; 8+ messages in thread
From: Simon @ 2026-04-01  7:48 UTC (permalink / raw)
  To: Midgy BALON, iommu
  Cc: joro, will, robin.murphy, heiko, jonas, linux-arm-kernel,
	linux-rockchip, linux-kernel, stable

Hi Midgy,

On 2026/3/31 15:50, Midgy BALON wrote:
> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> available memory for IOMMU v2") removed GFP_DMA32 from
> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
> supports up to 40-bit physical addresses for page tables.  However, the
> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
> physical addresses above 4 GB regardless of the address encoding range.
>
> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
> GFP_DMA32 causes two distinct failure modes:
>
> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>     memory above 0x100000000.  The hardware page-table walker issues a
>     bus error trying to dereference those addresses, causing an IOMMU
>     fault on the first DMA transaction.
Which IP block is hitting this? We'd like to take a look on our end.
> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>     above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>     then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>     returns phys_to_virt() of the bounce buffer address; PTEs are written
>     there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>     original (zero) data back over the bounce buffer, silently erasing the
>     freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
This probably needs a separate patch. One way to fix it would be to track
the original L2 page table base addresses in struct rk_iommu_domain,
then have rk_dte_get_page_table() return the tracked address instead of
deriving it from the DTE.
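
As a rough illustration of the tracking idea, here is a minimal user-space
sketch in C. It is not driver code: struct domain_sketch,
sketch_install_table() and sketch_get_page_table() are hypothetical
stand-ins, and the real layout of struct rk_iommu_domain is not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DTE 1024 /* assumed number of directory entries */

/* Hypothetical stand-in for the relevant part of struct rk_iommu_domain. */
struct domain_sketch {
	uint32_t dt[NUM_DTE];      /* DTEs holding device-visible addresses */
	uint32_t *pt_cpu[NUM_DTE]; /* tracked CPU pointer of each L2 table */
};

/* Record the original allocation next to the DTE when a table is installed. */
static void sketch_install_table(struct domain_sketch *d, size_t idx,
				 uint32_t *cpu_addr, uint32_t dte)
{
	assert(idx < NUM_DTE);
	d->pt_cpu[idx] = cpu_addr;
	d->dt[idx] = dte;
}

/* Return the tracked pointer instead of decoding an address from the DTE. */
static uint32_t *sketch_get_page_table(const struct domain_sketch *d,
				       size_t idx)
{
	assert(idx < NUM_DTE);
	return d->pt_cpu[idx];
}
```

With the CPU pointer stored next to the DTE, PTE writes always go to the
original allocation, so a later sync copies valid entries to whatever
buffer the device actually reads.
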
> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> currently only serves "rockchip,rk3568-iommu" in mainline.
>
> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>    - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>    - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>    - No IOMMU faults, correct inference results
>
> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
> Cc: stable@vger.kernel.org
> Cc: Jonas Karlman <jonas@kwiboo.se>
> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> ---
>   drivers/iommu/rockchip-iommu.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 85f3667e797..8b45db29471 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>   	.pt_address = &rk_dte_pt_address_v2,
>   	.mk_dtentries = &rk_mk_dte_v2,
>   	.mk_ptentries = &rk_mk_pte_v2,
> -	.dma_bit_mask = DMA_BIT_MASK(40),
> -	.gfp_flags = 0,
> +	.dma_bit_mask = DMA_BIT_MASK(32),
> +	.gfp_flags = GFP_DMA32,
>   };
>   
>   static const struct of_device_id rk_iommu_dt_ids[] = {



* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-04-01  7:48 ` Simon
@ 2026-04-01  8:41   ` Jonas Karlman
  2026-04-01 10:22     ` Simon Xue
  0 siblings, 1 reply; 8+ messages in thread
From: Jonas Karlman @ 2026-04-01  8:41 UTC (permalink / raw)
  To: Simon, Midgy BALON
  Cc: iommu, joro, will, robin.murphy, heiko, linux-arm-kernel,
	linux-rockchip, linux-kernel, stable

Hi Simon,

On 4/1/2026 9:48 AM, Simon wrote:
> Hi Midgy,
> 
> On 2026/3/31 15:50, Midgy BALON wrote:
>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
>> available memory for IOMMU v2") removed GFP_DMA32 from
>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
>> supports up to 40-bit physical addresses for page tables.  However, the
>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
>> physical addresses above 4 GB regardless of the address encoding range.
>>
>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
>> GFP_DMA32 causes two distinct failure modes:
>>
>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>>     memory above 0x100000000.  The hardware page-table walker issues a
>>     bus error trying to dereference those addresses, causing an IOMMU
>>     fault on the first DMA transaction.
>
> Which IP block is hitting this? We'd like to take a look on our end.

I have seen reports that the NPU MMU on RK3568/RK3566 is having some
issue using DTE/PTE with >32-bit addresses, maybe it uses a different
MMU hw revision or has some hw errata?

From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was
able to handle 40-bit addressable DTE/PTE, hence the original commit
2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
memory for IOMMU v2").

As also mentioned in my reply at [1], maybe the NPU MMU has some hw
limitation or errata and may need to use a different compatible.

[1] https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/

Regards,
Jonas

>
>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>>     above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>>     then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>>     returns phys_to_virt() of the bounce buffer address; PTEs are written
>>     there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>>     original (zero) data back over the bounce buffer, silently erasing the
>>     freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
>
> This probably needs a separate patch. One way to fix it would be to track
> the original L2 page table base addresses in struct rk_iommu_domain,
> then have rk_dte_get_page_table() return the tracked address instead of
> deriving it from the DTE.
>
>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
>> currently only serves "rockchip,rk3568-iommu" in mainline.
>>
>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>>    - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>>    - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>>    - No IOMMU faults, correct inference results
>>
>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
>> Cc: stable@vger.kernel.org
>> Cc: Jonas Karlman <jonas@kwiboo.se>
>> Signed-off-by: Midgy BALON <midgy971@gmail.com>
>> ---
>>   drivers/iommu/rockchip-iommu.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
>> index 85f3667e797..8b45db29471 100644
>> --- a/drivers/iommu/rockchip-iommu.c
>> +++ b/drivers/iommu/rockchip-iommu.c
>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>>   	.pt_address = &rk_dte_pt_address_v2,
>>   	.mk_dtentries = &rk_mk_dte_v2,
>>   	.mk_ptentries = &rk_mk_pte_v2,
>> -	.dma_bit_mask = DMA_BIT_MASK(40),
>> -	.gfp_flags = 0,
>> +	.dma_bit_mask = DMA_BIT_MASK(32),
>> +	.gfp_flags = GFP_DMA32,
>>   };
>>   
>>   static const struct of_device_id rk_iommu_dt_ids[] = {




* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-04-01  8:41   ` Jonas Karlman
@ 2026-04-01 10:22     ` Simon Xue
  2026-04-03  4:40       ` Simon Xue
  0 siblings, 1 reply; 8+ messages in thread
From: Simon Xue @ 2026-04-01 10:22 UTC (permalink / raw)
  To: Jonas Karlman, Midgy BALON
  Cc: iommu, joro, will, robin.murphy, heiko, linux-arm-kernel,
	linux-rockchip, linux-kernel, stable

Hi Jonas,

On 2026/4/1 16:41, Jonas Karlman wrote:
> Hi Simon,
>
> On 4/1/2026 9:48 AM, Simon wrote:
>> Hi Midgy,
>>
>> On 2026/3/31 15:50, Midgy BALON wrote:
>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
>>> available memory for IOMMU v2") removed GFP_DMA32 from
>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
>>> supports up to 40-bit physical addresses for page tables.  However, the
>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
>>> physical addresses above 4 GB regardless of the address encoding range.
>>>
>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
>>> GFP_DMA32 causes two distinct failure modes:
>>>
>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>>>      memory above 0x100000000.  The hardware page-table walker issues a
>>>      bus error trying to dereference those addresses, causing an IOMMU
>>>      fault on the first DMA transaction.
>> Which IP block is hitting this? We'd like to take a look on our end.
> I have seen reports that the NPU MMU on RK3568/RK3566 is having some
> issue using DTE/PTE with >32-bit addresses, maybe it uses a different
> MMU hw revision or has some hw errata?
>
>  From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was
> able to handle 40-bit addressable DTE/PTE, hence the original commit
> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
> memory for IOMMU v2").
>
> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
> limitation or errata and may need to use a different compatible.

Yes, we are checking internally whether different IOMMU versions are
integrated.

I will share what we find once we have results.

> [1] https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
>
> Regards,
> Jonas
>
>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>>>      above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>>>      then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>>>      returns phys_to_virt() of the bounce buffer address; PTEs are written
>>>      there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>>>      original (zero) data back over the bounce buffer, silently erasing the
>>>      freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
>> This probably needs a separate patch. One way to fix it would be to track
>> the original L2 page table base addresses in struct rk_iommu_domain,
>> then have rk_dte_get_page_table() return the tracked address instead of
>> deriving it from the DTE.
>>
>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
>>> currently only serves "rockchip,rk3568-iommu" in mainline.
>>>
>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>>>     - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>>>     - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>>>     - No IOMMU faults, correct inference results
>>>
>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
>>> Cc: stable@vger.kernel.org
>>> Cc: Jonas Karlman <jonas@kwiboo.se>
>>> Signed-off-by: Midgy BALON <midgy971@gmail.com>
>>> ---
>>>    drivers/iommu/rockchip-iommu.c | 4 ++--
>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
>>> index 85f3667e797..8b45db29471 100644
>>> --- a/drivers/iommu/rockchip-iommu.c
>>> +++ b/drivers/iommu/rockchip-iommu.c
>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>>>    	.pt_address = &rk_dte_pt_address_v2,
>>>    	.mk_dtentries = &rk_mk_dte_v2,
>>>    	.mk_ptentries = &rk_mk_pte_v2,
>>> -	.dma_bit_mask = DMA_BIT_MASK(40),
>>> -	.gfp_flags = 0,
>>> +	.dma_bit_mask = DMA_BIT_MASK(32),
>>> +	.gfp_flags = GFP_DMA32,
>>>    };
>>>    
>>>    static const struct of_device_id rk_iommu_dt_ids[] = {
>



* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-04-01 10:22     ` Simon Xue
@ 2026-04-03  4:40       ` Simon Xue
  2026-04-03 14:02         ` Midgy Balon
  0 siblings, 1 reply; 8+ messages in thread
From: Simon Xue @ 2026-04-03  4:40 UTC (permalink / raw)
  To: Jonas Karlman, Midgy BALON
  Cc: iommu, joro, will, robin.murphy, heiko, linux-arm-kernel,
	linux-rockchip, linux-kernel, stable


On 2026/4/1 18:22, Simon Xue wrote:
> Hi Jonas,
>
> On 2026/4/1 16:41, Jonas Karlman wrote:
>> Hi Simon,
>>
>> On 4/1/2026 9:48 AM, Simon wrote:
>>> Hi Midgy,
>>>
>>> On 2026/3/31 15:50, Midgy BALON wrote:
>>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
>>>> available memory for IOMMU v2") removed GFP_DMA32 from
>>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
>>>> supports up to 40-bit physical addresses for page tables. However, the
>>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
>>>> physical addresses above 4 GB regardless of the address encoding range.
>>>>
>>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
>>>> GFP_DMA32 causes two distinct failure modes:
>>>>
>>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>>>>      memory above 0x100000000.  The hardware page-table walker issues a
>>>>      bus error trying to dereference those addresses, causing an IOMMU
>>>>      fault on the first DMA transaction.
>>> Which IP block is hitting this? We'd like to take a look on our end.
>> I have seen reports that the NPU MMU on RK3568/RK3566 is having some
>> issue using DTE/PTE with >32-bit addresses, maybe it uses a different
>> MMU hw revision or has some hw errata?
>>
>>  From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was
>> able to handle 40-bit addressable DTE/PTE, hence the original commit
>> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
>> memory for IOMMU v2").
>>
>> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
>> limitation or errata and may need to use a different compatible.
>
> Yes, we are checking internally whether different IOMMU versions are
> integrated.
>
> I will share what we find once we have results.
>
We confirmed internally that the RK356x SoCs integrate two different IOMMU
versions (v1.0 and v2.0); for example, the NPU and ISP use the v1.0 IOMMU.

Both versions can map 40-bit physical pages, but v1.0 does not support 
placing the first-level page table above 4 GB.

To fix this, I think we need to land this patch first: 
https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/

Then on top of that, we can add a new compatible string to distinguish 
the IOMMU versions.

>> [1] https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
>>
>> Regards,
>> Jonas
>>
>>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>>>>      above the SWIOTLB window.  dma_map_single() with DMA_BIT_MASK(32)
>>>>      then bounces them into a buffer below 4 GB.  rk_dte_get_page_table()
>>>>      returns phys_to_virt() of the bounce buffer address; PTEs are written
>>>>      there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>>>>      original (zero) data back over the bounce buffer, silently erasing the
>>>>      freshly written PTEs.  The IOMMU faults because every PTE reads as zero.
>>> This probably needs a separate patch. One way to fix it would be to track
>>> the original L2 page table base addresses in struct rk_iommu_domain,
>>> then have rk_dte_get_page_table() return the tracked address instead of
>>> deriving it from the DTE.
>>>
>>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
>>>> currently only serves "rockchip,rk3568-iommu" in mainline.
>>>>
>>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
>>>>     - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
>>>>     - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
>>>>     - No IOMMU faults, correct inference results
>>>>
>>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
>>>> Cc: stable@vger.kernel.org
>>>> Cc: Jonas Karlman <jonas@kwiboo.se>
>>>> Signed-off-by: Midgy BALON <midgy971@gmail.com>
>>>> ---
>>>>    drivers/iommu/rockchip-iommu.c | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
>>>> index 85f3667e797..8b45db29471 100644
>>>> --- a/drivers/iommu/rockchip-iommu.c
>>>> +++ b/drivers/iommu/rockchip-iommu.c
>>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>>>>    	.pt_address = &rk_dte_pt_address_v2,
>>>>    	.mk_dtentries = &rk_mk_dte_v2,
>>>>    	.mk_ptentries = &rk_mk_pte_v2,
>>>> -	.dma_bit_mask = DMA_BIT_MASK(40),
>>>> -	.gfp_flags = 0,
>>>> +	.dma_bit_mask = DMA_BIT_MASK(32),
>>>> +	.gfp_flags = GFP_DMA32,
>>>>    };
>>>>    
>>>>    static const struct of_device_id rk_iommu_dt_ids[] = {
>>



* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
  2026-04-03  4:40       ` Simon Xue
@ 2026-04-03 14:02         ` Midgy Balon
  0 siblings, 0 replies; 8+ messages in thread
From: Midgy Balon @ 2026-04-03 14:02 UTC (permalink / raw)
  To: Simon Xue
  Cc: Jonas Karlman, iommu, joro, will, robin.murphy, Heiko Stuebner,
	linux-arm-kernel, linux-rockchip, linux-kernel, stable

 From: Midgy BALON <midgy971@gmail.com>
 To: Simon Xue <xxm@rock-chips.com>
 Cc: Jonas Karlman <jonas@kwiboo.se>, iommu@lists.linux.dev,
     joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
     heiko@sntech.de, linux-arm-kernel@lists.infradead.org,
     linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
     stable@vger.kernel.org
 In-Reply-To: <5663593b-2c53-4632-ad2c-db9efa8e9ab2@rock-chips.com>
 References: <20260331075010.1463-1-midgy971@gmail.com>
             <0f285782-b12a-4abd-bca7-b6c549bed59f@rock-chips.com>
             <e622cc9e-8fb0-454a-b88e-dc13cf2ff507@kwiboo.se>
             <89ed223d-1a2c-447d-9f21-76969e668855@rock-chips.com>
             <5663593b-2c53-4632-ad2c-db9efa8e9ab2@rock-chips.com>
 Subject: Re: [PATCH] iommu/rockchip: fix page table allocation flags
for v2 IOMMU

 On 4/3/2026, Simon Xue wrote:
 > We confirmed internally that the RK356x SoCs integrate two different
 > IOMMU versions (v1.0 and v2.0); for example, the NPU and ISP use the
 > v1.0 IOMMU.
 >
 > Both versions can map 40-bit physical pages, but v1.0 does not support
 > placing the first-level page table above 4 GB.
 >
 > To fix this, I think we need to land this patch first:
 > https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/
 >
 > Then on top of that, we can add a new compatible string to distinguish
 > the IOMMU versions.

 Thank you Simon and Jonas for the internal investigation. This explains
 exactly what I observed.

 To answer Simon's earlier question: the IP block hitting both failure
 modes is the NPU IOMMU (rknpu_mmu, at 0xfde4b000), currently bound
 to "rockchip,rk3568-iommu" in rk356x-base.dtsi. Both the downstream
 rknpu driver and the upstream Rocket accel driver (drivers/accel/rocket/)
 use this IOMMU.

 The v1.0 first-level page table constraint explains both failure modes.
 On boards with more than 4 GB of RAM the DTE table can be allocated
 above 0x100000000, and the v1.0 hardware silently truncates or errors
 on that address. The SWIOTLB bounce-buffer path is a consequence of
 the same root cause: with DMA_BIT_MASK(32) on the NPU device, bounce
 buffers land below 4 GB, phys_to_virt() of the bounce address is then
 used as the PTE write target, and the subsequent
 dma_sync_single_for_device(DMA_TO_DEVICE) overwrites those PTEs with
 zeros from the original buffer.
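
The bounce-buffer sequence described above can be reproduced with a toy
user-space model in C. This is only a sketch under assumptions: struct
bounce_mapping and the model_*() helpers are hypothetical stand-ins for the
SWIOTLB copy behaviour, and no real DMA API is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_PTE 1024 /* assumed number of entries in one second-level table */

/* Toy model of one bounced mapping: the table the driver allocated and the
 * low-memory copy the device actually reads. */
struct bounce_mapping {
	uint32_t orig[NUM_PTE];   /* allocated above 4 GB, zero-initialised */
	uint32_t bounce[NUM_PTE]; /* bounce slot below 4 GB */
};

/* Models dma_map_single(..., DMA_TO_DEVICE): copy original into the slot. */
static void model_map(struct bounce_mapping *m)
{
	memset(m->orig, 0, sizeof(m->orig));
	memcpy(m->bounce, m->orig, sizeof(m->bounce));
}

/* Models dma_sync_single_for_device(..., DMA_TO_DEVICE): copy again. */
static void model_sync_for_device(struct bounce_mapping *m)
{
	memcpy(m->bounce, m->orig, sizeof(m->bounce));
}

/* Write `pte` through `cpu_view`, sync, and return what the device reads. */
static uint32_t model_write_pte(struct bounce_mapping *m, uint32_t *cpu_view,
				size_t idx, uint32_t pte)
{
	assert(idx < NUM_PTE);
	cpu_view[idx] = pte;
	model_sync_for_device(m);
	return m->bounce[idx];
}
```

Passing m.bounce as cpu_view models the failing path (the write lands in the
bounce slot and the sync overwrites it with zeros); passing m.orig models a
write to the original allocation, which survives the sync.
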

 Please consider my original patch withdrawn. Modifying iommu_data_ops_v2
 was too broad and would have incorrectly constrained VOP2 MMU and all
 other v2 IOMMU users.

 I agree fully with the two-step approach. On top of your per-device-ops
 patch [1], I plan to send:

   [1/2] iommu/rockchip: Add "rockchip,rk3568-iommu-v1" compatible
         for IOMMU v1.0 blocks (NPU, ISP/VICAP) on RK3568
         — ops with .gfp_flags = GFP_DMA32,
                .dma_bit_mask = DMA_BIT_MASK(40)
         (v1.0 can still map 40-bit physical pages; only the DTE
         table base must be below 4 GB)
   [2/2] arm64: dts: rockchip: rk356x: Use "rockchip,rk3568-iommu-v1"
         for rknpu_mmu (0xfde4b000) and vicap_mmu (0xfdfe0800)
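
To make the intended split concrete, here is a small user-space sketch in C
of the two ops variants. Everything in it is an assumption for illustration:
struct ops_sketch, SKETCH_GFP_DMA32, SKETCH_DMA_BIT_MASK() and the helper
names are stand-ins, not the kernel definitions, and the final patch may
look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GFP_DMA32 0x4u /* stand-in flag, not the kernel value */
#define SKETCH_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical reduction of struct rk_iommu_ops to the two fields at issue. */
struct ops_sketch {
	uint64_t dma_bit_mask;  /* which physical pages may be mapped */
	unsigned int gfp_flags; /* where page tables are allocated */
};

/* Proposed v1.0 ops: 40-bit mappings, tables restricted to below 4 GB. */
static const struct ops_sketch ops_v1_rk3568 = {
	.dma_bit_mask = SKETCH_DMA_BIT_MASK(40),
	.gfp_flags = SKETCH_GFP_DMA32,
};

/* Existing v2 ops: 40-bit mappings, tables anywhere in that range. */
static const struct ops_sketch ops_v2 = {
	.dma_bit_mask = SKETCH_DMA_BIT_MASK(40),
	.gfp_flags = 0,
};

/* May a page table live at this physical address? */
static bool table_addr_ok(const struct ops_sketch *ops, uint64_t table_phys)
{
	assert(ops);
	if (ops->gfp_flags & SKETCH_GFP_DMA32)
		return table_phys <= SKETCH_DMA_BIT_MASK(32);
	return table_phys <= ops->dma_bit_mask;
}

/* May a mapped data page live at this physical address? */
static bool page_addr_ok(const struct ops_sketch *ops, uint64_t page_phys)
{
	assert(ops);
	return page_phys <= ops->dma_bit_mask;
}
```

In this model an address such as 0x180000000 is rejected for a v1.0 table
but still accepted as a v1.0 data page, which is the distinction the new
compatible is meant to express.
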

 One note on the SWIOTLB issue: with GFP_DMA32 in the new ops, page
 table allocations never reach SWIOTLB, so the "track L2 base addresses"
 approach you suggested should not be necessary — GFP_DMA32 prevents the
 bounce-buffer poisoning at the source. Happy to be corrected if there
 is another path where it is still needed.

 I am happy to add Tested-by to your per-device-ops patch [1].

 [1] https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/

 Regards,
 Midgy BALON

On Fri, 3 Apr 2026 at 06:40, Simon Xue <xxm@rock-chips.com> wrote:
>
>
> On 2026/4/1 18:22, Simon Xue wrote:
> > Hi Jonas,
> >
> > On 2026/4/1 16:41, Jonas Karlman wrote:
> >> Hi Simon,
> >>
> >> On 4/1/2026 9:48 AM, Simon wrote:
> >>> Hi Midgy,
> >>>
> >>> On 2026/3/31 15:50, Midgy BALON wrote:
> >>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> >>>> available memory for IOMMU v2") removed GFP_DMA32 from
> >>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
> >>>> supports up to 40-bit physical addresses for page tables. However, the
> >>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
> >>>> physical addresses above 4 GB regardless of the address encoding range.
> >>>>
> >>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
> >>>> GFP_DMA32 causes two distinct failure modes:
> >>>>
> >>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
> >>>>      memory above 0x100000000.  The hardware page-table walker issues
> >>>>      a bus error trying to dereference those addresses, causing an
> >>>>      IOMMU fault on the first DMA transaction.
> >>> Which IP block is hitting this? We'd like to take a look on our end.
> >> I have seen reports that the NPU MMU on RK3568/RK3566 is having some
> >> issue using DTE/PTE with >32-bit addresses, maybe it uses a different
> >> MMU hw revision or has some hw errata?
> >>
> >> From my own testing, at least the VOP2 MMU on RK3568 (and RK3588) was
> >> able to handle 40-bit addressable DTE/PTE, hence the original commit
> >> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available
> >> memory for IOMMU v2").
> >>
> >> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
> >> limitation or errata and may need to use a different compatible.
> >
> > Yes, we are checking internally whether different IOMMU versions are
> > integrated.
> >
> > I will share what we find once we have results.
> >
> We checked internally and confirmed that the RK356x SoCs integrate two
> different IOMMU versions (v1.0 and v2.0); the NPU and ISP, for example,
> use the v1.0 IOMMU.
>
> Both versions can map 40-bit physical pages, but v1.0 does not support
> placing the first-level page table above 4 GB.
>
> To fix this, I think we need to land this patch first:
> https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/
>
> Then on top of that, we can add a new compatible string to distinguish
> the IOMMU versions.
>
> >> [1]
> >> https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
> >>
> >> Regards,
> >> Jonas
> >>
> >>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables
> >>>>      land above the SWIOTLB window.  dma_map_single() with
> >>>>      DMA_BIT_MASK(32) then bounces them into a buffer below 4 GB.
> >>>>      rk_dte_get_page_table() returns phys_to_virt() of the bounce
> >>>>      buffer address; PTEs are written there; the next
> >>>>      dma_sync_single_for_device(DMA_TO_DEVICE) copies the original
> >>>>      (zero) data back over the bounce buffer, silently erasing the
> >>>>      freshly written PTEs.  The IOMMU faults because every PTE reads
> >>>>      as zero.
> >>> This probably needs a separate patch. One way to fix it would be to
> >>> track the original L2 page table base addresses in struct
> >>> rk_iommu_domain, then have rk_dte_get_page_table() return the tracked
> >>> address instead of deriving it from the DTE.
> >>>
> >>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> >>>> currently only serves "rockchip,rk3568-iommu" in mainline.
> >>>>
> >>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
> >>>>     - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
> >>>>     - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
> >>>>     - No IOMMU faults, correct inference results
> >>>>
> >>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
> >>>> Cc: stable@vger.kernel.org
> >>>> Cc: Jonas Karlman <jonas@kwiboo.se>
> >>>> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> >>>> ---
> >>>>    drivers/iommu/rockchip-iommu.c | 4 ++--
> >>>>    1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> >>>> index 85f3667e797..8b45db29471 100644
> >>>> --- a/drivers/iommu/rockchip-iommu.c
> >>>> +++ b/drivers/iommu/rockchip-iommu.c
> >>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
> >>>>        .pt_address = &rk_dte_pt_address_v2,
> >>>>        .mk_dtentries = &rk_mk_dte_v2,
> >>>>        .mk_ptentries = &rk_mk_pte_v2,
> >>>> -    .dma_bit_mask = DMA_BIT_MASK(40),
> >>>> -    .gfp_flags = 0,
> >>>> +    .dma_bit_mask = DMA_BIT_MASK(32),
> >>>> +    .gfp_flags = GFP_DMA32,
> >>>>    };
> >>>>
> >>>>    static const struct of_device_id rk_iommu_dt_ids[] = {
> >>



end of thread, other threads:[~2026-04-03 14:00 UTC | newest]

Thread overview: 8+ messages
2026-03-31  7:50 [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU Midgy BALON
2026-03-31  7:57 ` Shawn Lin
2026-03-31 18:13 ` Jonas Karlman
2026-04-01  7:48 ` Simon
2026-04-01  8:41   ` Jonas Karlman
2026-04-01 10:22     ` Simon Xue
2026-04-03  4:40       ` Simon Xue
2026-04-03 14:02         ` Midgy Balon
