* [PATCH 0/4] MTK_IOMMU: Optimize mapping / unmapping performance
@ 2020-10-19 11:30 ` Chao Hao
0 siblings, 0 replies; 36+ messages in thread
From: Chao Hao @ 2020-10-19 11:30 UTC (permalink / raw)
To: Joerg Roedel, Matthias Brugger
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, linux-arm-kernel, Mingyuan Ma
On MTK platforms, mtk_iommu uses iotlb_sync(), tlb_add_range() and tlb_flush_walk/leaf()
to do tlb sync when the iommu driver maps or unmaps an iova range. But if the buffer is large,
it may consist of many pages (4K/8K/64K/1MB...), so the iommu driver may run tlb sync many
times while mapping it, which degrades performance seriously. To resolve the issue, we
propose adding an iotlb_sync_range() callback to iommu_ops, which takes the iova and size to
tlb sync. MTK_IOMMU will call iotlb_sync_range() once the whole mapping/unmapping has
completed, and drop iotlb_sync(), tlb_add_range() and tlb_flush_walk/leaf().
So this patchset replaces iotlb_sync(), tlb_add_range() and tlb_flush_walk/leaf() with
the iotlb_sync_range() callback.

Chao Hao (4):
  iommu: Introduce iotlb_sync_range callback
  iommu/mediatek: Add iotlb_sync_range() support
  iommu/mediatek: Remove unnecessary tlb sync
  iommu/mediatek: Adjust iotlb_sync_range

 drivers/iommu/dma-iommu.c |  9 +++++++++
 drivers/iommu/iommu.c     |  7 +++++++
 drivers/iommu/mtk_iommu.c | 36 ++++++++----------------------------
 include/linux/iommu.h     |  2 ++
 4 files changed, 26 insertions(+), 28 deletions(-)

--
2.18.0
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
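
The cost the cover letter describes can be illustrated with a small counting model. This is only a sketch under stated assumptions: `sync_counter`, `map_sync_per_page()` and `map_then_sync_once()` are hypothetical names that do not exist in the kernel, and the model counts sync operations rather than touching any hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: counts how many TLB sync operations are issued. */
struct sync_counter {
	unsigned long syncs;
};

/* Sync after every page table entry, as the cover letter says happens today. */
static unsigned long map_sync_per_page(struct sync_counter *c, size_t size,
				       size_t pgsize)
{
	size_t mapped = 0;

	assert(pgsize > 0);
	while (mapped < size) {
		mapped += pgsize;	/* write one entry ... */
		c->syncs++;		/* ... and sync immediately */
	}
	return c->syncs;
}

/* Write all entries first, then sync once for the whole range. */
static unsigned long map_then_sync_once(struct sync_counter *c, size_t size,
					size_t pgsize)
{
	size_t mapped = 0;

	assert(pgsize > 0);
	while (mapped < size)
		mapped += pgsize;	/* write one entry, defer the sync */
	if (mapped)
		c->syncs++;		/* single sync covering the buffer */
	return c->syncs;
}
```

For a 50MB buffer made of 4K pages, the first function counts one sync per page while the second counts a single sync, which is the difference the series is trying to exploit. How much wall-clock time that saves depends on the hardware and is not modelled here.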
* [PATCH 1/4] iommu: Introduce iotlb_sync_range callback
2020-10-19 11:30 ` Chao Hao
@ 2020-10-19 11:30 ` Chao Hao
-1 siblings, 0 replies; 36+ messages in thread
From: Chao Hao @ 2020-10-19 11:30 UTC (permalink / raw)
To: Joerg Roedel, Matthias Brugger
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, linux-arm-kernel, Mingyuan Ma

Add an iotlb_sync_range callback so that a driver can specify the iova and
size for a tlb sync. The iommu core will call iotlb_sync_range() after the
whole mapping/unmapping is completed, passing the start iova and the total
buffer size. iotlb_sync() and tlb_flush_walk/leaf() can then be skipped, so
iotlb_sync_range() improves performance by reducing the time spent on tlb
sync.

Signed-off-by: Chao Hao <chao.hao@mediatek.com>
---
 drivers/iommu/dma-iommu.c | 9 +++++++++
 drivers/iommu/iommu.c     | 7 +++++++
 include/linux/iommu.h     | 2 ++
 3 files changed, 18 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..e2e9114c4ae2 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -479,6 +479,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 		size_t size, int prot, u64 dma_mask)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	const struct iommu_ops *ops = domain->ops;
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
 	size_t iova_off = iova_offset(iovad, phys);
@@ -497,6 +498,10 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 		iommu_dma_free_iova(cookie, iova, size);
 		return DMA_MAPPING_ERROR;
 	}
+
+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, size);
+
 	return iova + iova_off;
 }

@@ -1165,6 +1170,7 @@ void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size)
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 		phys_addr_t msi_addr, struct iommu_domain *domain)
 {
+	const struct iommu_ops *ops = domain->ops;
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iommu_dma_msi_page *msi_page;
 	dma_addr_t iova;
@@ -1187,6 +1193,9 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (iommu_map(domain, iova, msi_addr, size, prot))
 		goto out_free_iova;

+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, size);
+
 	INIT_LIST_HEAD(&msi_page->list);
 	msi_page->phys = msi_addr;
 	msi_page->iova = iova;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..e399a238d1e9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2304,6 +2304,9 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 		unmapped += unmapped_page;
 	}

+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, size);
+
 	trace_unmap(orig_iova, size, unmapped);
 	return unmapped;
 }
@@ -2334,6 +2337,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		struct scatterlist *sg, unsigned int nents, int prot,
 		gfp_t gfp)
 {
+	const struct iommu_ops *ops = domain->ops;
 	size_t len = 0, mapped = 0;
 	phys_addr_t start;
 	unsigned int i = 0;
@@ -2364,6 +2368,9 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		sg = sg_next(sg);
 	}

+	if (ops->iotlb_sync_range)
+		ops->iotlb_sync_range(iova, mapped);
+
 	return mapped;

 out_err:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fee209efb756..4be90324bd23 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,7 @@ struct iommu_iotlb_gather {
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
+ * @iotlb_sync_range: Sync specific iova and size mappings to the hardware
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
@@ -244,6 +245,7 @@ struct iommu_ops {
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size, struct iommu_iotlb_gather *iotlb_gather);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
+	void (*iotlb_sync_range)(unsigned long iova, size_t size);
 	void (*iotlb_sync_map)(struct iommu_domain *domain);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
--
2.18.0
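
The pattern in this patch, an optional callback that is invoked once after a page-by-page loop, can be sketched outside the kernel as follows. Everything here is an assumption made for illustration: `demo_ops`, `demo_map()` and the recording callbacks are hypothetical and are not the kernel's `iommu_ops` or its map path. The sketch keeps a copy of the start address because its loop advances `iova`, so the final range sync would otherwise receive the end of the range.

```c
#include <stddef.h>

/* Hypothetical ops table; sync_range is optional and may be NULL. */
struct demo_ops {
	int (*map_page)(unsigned long iova, size_t pgsize);
	void (*sync_range)(unsigned long iova, size_t size);
};

/* Recording state so the effect of the callbacks can be observed. */
static unsigned long last_sync_iova;
static size_t last_sync_size;
static unsigned int sync_calls;

/* Stand-in for writing one page table entry; always succeeds here. */
static int demo_map_page(unsigned long iova, size_t pgsize)
{
	(void)iova;
	(void)pgsize;
	return 0;
}

/* Stand-in for a driver's range sync; only records its arguments. */
static void demo_sync_range(unsigned long iova, size_t size)
{
	last_sync_iova = iova;
	last_sync_size = size;
	sync_calls++;
}

/* Map page by page, then issue one range sync if the callback exists. */
static size_t demo_map(const struct demo_ops *ops, unsigned long iova,
		       size_t size, size_t pgsize)
{
	unsigned long orig_iova = iova;	/* the loop below advances iova */
	size_t mapped = 0;

	while (mapped < size) {
		if (ops->map_page(iova, pgsize))
			break;
		iova += pgsize;
		mapped += pgsize;
	}

	if (ops->sync_range && mapped)
		ops->sync_range(orig_iova, mapped);

	return mapped;
}
```

A table built as `{ demo_map_page, demo_sync_range }` gets exactly one sync per `demo_map()` call, covering the start address and the mapped length; a table with a NULL `sync_range` is mapped without any sync, mirroring the NULL check in the patch.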
* [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
2020-10-19 11:30 ` Chao Hao
@ 2020-10-19 11:30 ` Chao Hao
-1 siblings, 0 replies; 36+ messages in thread
From: Chao Hao @ 2020-10-19 11:30 UTC (permalink / raw)
To: Joerg Roedel, Matthias Brugger
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, linux-arm-kernel, Mingyuan Ma

MTK_IOMMU driver writes one page entry and does tlb flush at a time
currently. More optimal would be to aggregate the writes and flush
BUS buffer in the end.
For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range()
instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase
50% performance or more(depending on size of every page size) in
comparison to flushing after each page entry update. So we prefer to
use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and
tlb_flush_walk/leaf() for MTK platforms.

Signed-off-by: Chao Hao <chao.hao@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 785b228d39a6..d3400c15ff7b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 	}
 }

+static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
+{
+	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
+}
+
 static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
 					    unsigned long iova, size_t granule,
 					    void *cookie)
@@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.map		= mtk_iommu_map,
 	.unmap		= mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
+	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
 	.iotlb_sync	= mtk_iommu_iotlb_sync,
 	.iova_to_phys	= mtk_iommu_iova_to_phys,
 	.probe_device	= mtk_iommu_probe_device,
--
2.18.0
* Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
2020-10-19 11:30 ` Chao Hao
(?)
(?)
@ 2020-10-21 16:55 ` Robin Murphy
-1 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2020-10-21 16:55 UTC (permalink / raw)
To: Chao Hao, Joerg Roedel, Matthias Brugger
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, iommu,
linux-mediatek, Mingyuan Ma, linux-arm-kernel

On 2020-10-19 12:30, Chao Hao wrote:
> MTK_IOMMU driver writes one page entry and does tlb flush at a time
> currently. More optimal would be to aggregate the writes and flush
> BUS buffer in the end.

That's exactly what iommu_iotlb_gather_add_page() is meant to achieve.
Rather than jumping straight into hacking up a new API to go round the
back of the existing API design, it would be far better to ask the
question of why that's not behaving as expected.

> For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range()
> instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase
> 50% performance or more(depending on size of every page size) in
> comparison to flushing after each page entry update. So we prefer to
> use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and
> tlb_flush_walk/leaf() for MTK platforms.

In the case of mapping, it sounds like what you actually want to do is
hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP
cleverer, because the current implementation is as dumb as it could
possibly be. In fact if we simply passed an address range to
.iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all
any more.

Robin.

> Signed-off-by: Chao Hao <chao.hao@mediatek.com>
> ---
>  drivers/iommu/mtk_iommu.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 785b228d39a6..d3400c15ff7b 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
>  	}
>  }
>
> +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
> +{
> +	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
> +}
> +
>  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
>  					    unsigned long iova, size_t granule,
>  					    void *cookie)
> @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
>  	.map		= mtk_iommu_map,
>  	.unmap		= mtk_iommu_unmap,
>  	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
> +	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
>  	.iotlb_sync	= mtk_iommu_iotlb_sync,
>  	.iova_to_phys	= mtk_iommu_iova_to_phys,
>  	.probe_device	= mtk_iommu_probe_device,
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
^ permalink raw reply [flat|nested] 36+ messages in thread
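The range-accumulation idea behind iommu_iotlb_gather_add_page() can be sketched in plain userspace C. This is a minimal sketch under stated assumptions, not the kernel implementation: struct gather_range, gather_init(), gather_add() and gather_size() are hypothetical names, and the sketch leaves out any conditions under which the real helper syncs before the range is complete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a gather structure. */
struct gather_range {
	unsigned long start;	/* lowest IOVA seen so far */
	unsigned long end;	/* highest IOVA seen so far, inclusive */
	int valid;		/* 0 until the first page is added */
};

static void gather_init(struct gather_range *g)
{
	g->start = 0;
	g->end = 0;
	g->valid = 0;
}

/* Widen the pending range so it covers [iova, iova + size - 1]. */
static void gather_add(struct gather_range *g, unsigned long iova, size_t size)
{
	unsigned long last;

	assert(size > 0);
	last = iova + size - 1;

	if (!g->valid) {
		g->start = iova;
		g->end = last;
		g->valid = 1;
		return;
	}
	if (iova < g->start)
		g->start = iova;
	if (last > g->end)
		g->end = last;
}

/* Size of the single invalidation that would cover everything gathered. */
static size_t gather_size(const struct gather_range *g)
{
	return g->valid ? (size_t)(g->end - g->start + 1) : 0;
}
```

With this shape, pages of different sizes added back to back collapse into one pending range, so a caller could issue a single ranged invalidation at the end instead of one per page.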
* Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
2020-10-21 16:55 ` Robin Murphy
(?)
(?)
@ 2020-10-23 5:57 ` chao hao
-1 siblings, 0 replies; 36+ messages in thread
From: chao hao @ 2020-10-23 5:57 UTC (permalink / raw)
To: Robin Murphy
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, Matthias Brugger, Mingyuan Ma, linux-arm-kernel

On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote:
> On 2020-10-19 12:30, Chao Hao wrote:
> > MTK_IOMMU driver writes one page entry and does tlb flush at a time
> > currently. More optimal would be to aggregate the writes and flush
> > BUS buffer in the end.
>
> That's exactly what iommu_iotlb_gather_add_page() is meant to achieve.
> Rather than jumping straight into hacking up a new API to go round the
> back of the existing API design, it would be far better to ask the
> question of why that's not behaving as expected.

Thanks for your review!

iommu_iotlb_gather_add_page() is called from io_pgtable_tlb_add_page().
io_pgtable_tlb_add_page() is only called in the unmapping flow; the
mapping flow in the Linux iommu driver doesn't call it. But mtk iommu
needs to do tlb sync in both mapping and unmapping to avoid stale data
remaining in the iommu tlb.

In addition, we hope to do tlb sync only once, when all the pages have
been mapped. iommu_iotlb_gather_add_page() may do tlb sync more than
once, because one whole buffer can consist of different page sizes
(1MB/64K/4K).

Based on these considerations, we couldn't find a more appropriate way
to do tlb sync for mtk iommu, so we added a new API.

>
> > For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range()
> > instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase
> > 50% performance or more(depending on size of every page size) in
> > comparison to flushing after each page entry update. So we prefer to
> > use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and
> > tlb_flush_walk/leaf() for MTK platforms.
>
> In the case of mapping, it sounds like what you actually want to do is
> hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP
> cleverer, because the current implementation is as dumb as it could
> possibly be.

iotlb_sync_map only has one parameter (iommu_domain), but an mtk
iommu_domain may include the whole iova space, so if mtk_iommu does tlb
sync based on the iommu_domain, it is in fact equivalent to a tlb flush
all.
The iommu driver will do tlb sync for every mapped page when mtk iommu
sets IO_PGTABLE_QUIRK_TLBI_ON_MAP (io_pgtable_tlb_flush_walk); as the
commit message mentioned, this degrades mapping performance on mtk
platforms.

> In fact if we simply passed an address range to
> .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all
> any more.

I know adding a new api is probably not a good idea, but I found that
tlb sync is only done after mapping each single page. If mtk_iommu hopes
to do tlb sync once after all the pages are mapped, could you give me
some advice? Thanks!

>
> Robin.
>
> > Signed-off-by: Chao Hao <chao.hao@mediatek.com>
> > ---
> >  drivers/iommu/mtk_iommu.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 785b228d39a6..d3400c15ff7b 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
> >  	}
> >  }
> >
> > +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
> > +{
> > +	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
> > +}
> > +
> >  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
> >  					    unsigned long iova, size_t granule,
> >  					    void *cookie)
> > @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
> >  	.map		= mtk_iommu_map,
> >  	.unmap		= mtk_iommu_unmap,
> >  	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
> > +	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
> >  	.iotlb_sync	= mtk_iommu_iotlb_sync,
> >  	.iova_to_phys	= mtk_iommu_iova_to_phys,
> >  	.probe_device	= mtk_iommu_probe_device,
> >
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
^ permalink raw reply [flat|nested] 36+ messages in thread
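The difference between flushing after every page entry and flushing once for the whole mapped range can be illustrated with a small userspace C sketch. All names here (fake_tlb_flush_range(), map_flush_per_page(), map_flush_once(), flush_count) are hypothetical; no real page tables or hardware are touched, and the counter only stands in for the number of TLB invalidations.

```c
#include <assert.h>
#include <stddef.h>

/* Counts simulated TLB invalidations; purely illustrative. */
static unsigned int flush_count;

/* Hypothetical stand-in for a ranged, synchronous TLB flush. */
static void fake_tlb_flush_range(unsigned long iova, size_t size)
{
	(void)iova;
	(void)size;
	flush_count++;
}

/* Flush after every page entry, as the commit message describes. */
static void map_flush_per_page(unsigned long iova, const size_t *pgsizes,
			       size_t n)
{
	assert(pgsizes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* The page table entry for pgsizes[i] would be written here. */
		fake_tlb_flush_range(iova, pgsizes[i]);
		iova += pgsizes[i];
	}
}

/* Write all entries first, then flush the whole range once. */
static void map_flush_once(unsigned long iova, const size_t *pgsizes,
			   size_t n)
{
	unsigned long start = iova;

	assert(pgsizes != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		iova += pgsizes[i];	/* entries would be written here */

	if (iova != start)
		fake_tlb_flush_range(start, iova - start);
}
```

For a buffer built from a mix of 1MB, 64K and 4K pages, the first variant issues one flush per entry while the second issues exactly one, which is the behaviour the thread is trying to reach through either a new callback or a ranged .iotlb_sync_map.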
* Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
2020-10-23 5:57 ` chao hao
(?)
(?)
@ 2020-10-23 6:04 ` chao hao
-1 siblings, 0 replies; 36+ messages in thread
From: chao hao @ 2020-10-23 6:04 UTC (permalink / raw)
To: Robin Murphy
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, Matthias Brugger, Mingyuan Ma, linux-arm-kernel

On Fri, 2020-10-23 at 13:57 +0800, chao hao wrote:
> On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote:
> > On 2020-10-19 12:30, Chao Hao wrote:
> > > MTK_IOMMU driver writes one page entry and does tlb flush at a time
> > > currently. More optimal would be to aggregate the writes and flush
> > > BUS buffer in the end.
> >
> > That's exactly what iommu_iotlb_gather_add_page() is meant to achieve.
> > Rather than jumping straight into hacking up a new API to go round the
> > back of the existing API design, it would be far better to ask the
> > question of why that's not behaving as expected.
>
> Thanks for you review!
>
> iommu_iotlb_gather_add_page is put in io_pgtable_tlb_add_page().
> io_pgtable_tlb_add_page() only be called in
> unmapping and mapping flow doesn't have it in linux iommu driver, but
> mtk iommu needs to do tlb sync in mapping
> and unmapping to avoid old data being in the iommu tlb.
>
> In addtion, we hope to do tlb sync once when all the pages mapping done.
> iommu_iotlb_gather_add_page maybe do
> tlb sync more than once. because one whole buffer consists of different
> page size(1MB/64K/4K).
>
> Based on the previous considerations, don't find more appropriate the
> way of tlb sync for mtk iommu, so we add a new API.
>
> >
> > > For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range()
> > > instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase
> > > 50% performance or more(depending on size of every page size) in
> > > comparison to flushing after each page entry update. So we prefer to
> > > use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and
> > > tlb_flush_walk/leaf() for MTK platforms.
> >
> > In the case of mapping, it sounds like what you actually want to do is
> > hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP
> > cleverer, because the current implementation is as dumb as it could
> > possibly be.
>
> iotlb_sync_map only has one parameter(iommu_domain), but mtk
> iommu_domain maybe include the whole iova space, if mtk_iommu to do tlb
> sync based on iommu_domain, it is equivalent to do tlb flush all in
> fact.
> iommu driver will do tlb sync in every mapping page when mtk iommu sets
> IO_PGTABLE_QUIRK_TLBI_ON_MAP(io_pgtable_tlb_flush_walk),
> as is the commit message mentioned, it will drop mapping performance in
> mtk platform.
>
>
> > In fact if we simply passed an address range to
> > .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all
> > any more.

Sorry, I forgot to reply to this question in my previous mail.
Do you mean we need to modify the input parameters of iotlb_sync_map()
(e.g. add a start/end iova)?

>
> I know it is not a good idea probably by adding a new api, but I found
> out that tlb sync only to be done after mapping one page, so if
> mtk_iommu hope to do tlb sync once after all the pages map done, could
> you give me some advices? thanks!
>
> >
> > Robin.
> >
> > > Signed-off-by: Chao Hao <chao.hao@mediatek.com>
> > > ---
> > >  drivers/iommu/mtk_iommu.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 785b228d39a6..d3400c15ff7b 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
> > >  	}
> > >  }
> > >
> > > +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
> > > +{
> > > +	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
> > > +}
> > > +
> > >  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
> > >  					    unsigned long iova, size_t granule,
> > >  					    void *cookie)
> > > @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
> > >  	.map		= mtk_iommu_map,
> > >  	.unmap		= mtk_iommu_unmap,
> > >  	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
> > > +	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
> > >  	.iotlb_sync	= mtk_iommu_iotlb_sync,
> > >  	.iova_to_phys	= mtk_iommu_iova_to_phys,
> > >  	.probe_device	= mtk_iommu_probe_device,
> > >
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support
  2020-10-23 5:57 ` chao hao
@ 2020-10-23 16:07 ` Robin Murphy
From: Robin Murphy @ 2020-10-23 16:07 UTC
To: chao hao
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, iommu,
linux-mediatek, Matthias Brugger, Mingyuan Ma, linux-arm-kernel

On 2020-10-23 06:57, chao hao wrote:
> On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote:
>> On 2020-10-19 12:30, Chao Hao wrote:
>>> MTK_IOMMU driver writes one page entry and does tlb flush at a time
>>> currently. More optimal would be to aggregate the writes and flush
>>> BUS buffer in the end.
>>
>> That's exactly what iommu_iotlb_gather_add_page() is meant to achieve.
>> Rather than jumping straight into hacking up a new API to go round the
>> back of the existing API design, it would be far better to ask the
>> question of why that's not behaving as expected.
>
> Thanks for you review!
>
> iommu_iotlb_gather_add_page is put in io_pgtable_tlb_add_page().
> io_pgtable_tlb_add_page() only be called in
> unmapping and mapping flow doesn't have it in linux iommu driver, but
> mtk iommu needs to do tlb sync in mapping
> and unmapping to avoid old data being in the iommu tlb.

Right, as I suspected, if it's primarily about the map path then the
answer is to use the existing API intended to accommodate that specific
case; if that API doesn't quite do what you need, then just fix it! It
doesn't make sense to clutter up the IOMMU core with multiple
overlapping APIs for TLB invalidation, especially when TLB invalidation
ultimately isn't all that complicated.

> In addtion, we hope to do tlb sync once when all the pages mapping done.
> iommu_iotlb_gather_add_page maybe do
> tlb sync more than once. because one whole buffer consists of different
> page size(1MB/64K/4K).

So if your hardware doesn't care about the granule size used for
invalidation matching the underlying mapping, use an implementation of
iommu_flush_ops::tlb_add_page that doesn't care about the granule size!

> Based on the previous considerations, don't find more appropriate the
> way of tlb sync for mtk iommu, so we add a new API.

I know I'm probably more familiar with this code than most people, but
from my perspective, this reads like "my car was dirty, so I had to buy
a new car" ;)

>>> For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range()
>>> instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase
>>> 50% performance or more(depending on size of every page size) in
>>> comparison to flushing after each page entry update. So we prefer to
>>> use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and
>>> tlb_flush_walk/leaf() for MTK platforms.
>>
>> In the case of mapping, it sounds like what you actually want to do is
>> hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP
>> cleverer, because the current implementation is as dumb as it could
>> possibly be.
>
> iotlb_sync_map only has one parameter(iommu_domain), but mtk
> iommu_domain maybe include the whole iova space, if mtk_iommu to do tlb
> sync based on iommu_domain, it is equivalent to do tlb flush all in
> fact.

The first, and so far only, user of this API is the Tegra GART, which
can only do a "flush all" operation, so the API is currently only as
complex as it needs to be, which is to say "not very". There are plenty
of options.

> iommu driver will do tlb sync in every mapping page when mtk iommu sets
> IO_PGTABLE_QUIRK_TLBI_ON_MAP(io_pgtable_tlb_flush_walk),
> as is the commit message mentioned, it will drop mapping performance in
> mtk platform.

And as I said, that quirk is implemented in a really simplistic way,
which is sure to be functionally correct, but has never been given any
performance consideration.

>> In fact if we simply passed an address range to
>> .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all
>> any more.
>
> I know it is not a good idea probably by adding a new api, but I found
> out that tlb sync only to be done after mapping one page, so if
> mtk_iommu hope to do tlb sync once after all the pages map done, could
> you give me some advices? thanks!

Getting rid of IO_PGTABLE_QUIRK_TLBI_ON_MAP and simply wiring
.iotlb_sync_map to .flush_iotlb_all is certainly the easiest way to make
.map quicker, although it will obviously have *some* impact on any other
live mappings. Given that in principle this should have the least CPU
overhead, then depending on TLB usage patterns there's a small chance it
might actually work out as a reasonable tradeoff, so I wouldn't
necessarily rule it out as a viable option without at least trying some
tests.

If it's cheap to *issue* invalidation commands, and the expense is in
waiting for them to actually complete, then you could still fire off an
invalidate for each page from .map, then wire up the sync (wait) step to
.iotlb_sync_map, still without needing any core changes.

Otherwise, it would seem reasonable to pass the complete address and
size of the iommu_map() operation to .iotlb_sync_map, so drivers can
perform their whole invalidation operation synchronously there.

What I *don't* think makes sense is to try passing a gather structure
through .map in the same way as for .unmap. That seems a bit too
invasive for what is still a fairly exceptional case, and half the stuff
that unmap operations will use the gather data for - freelists and such
- won't ever be relevant to map operations, so symmetry isn't really an
argument either.

Robin.

>>> Signed-off-by: Chao Hao <chao.hao@mediatek.com>
>>> ---
>>>   drivers/iommu/mtk_iommu.c | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>> index 785b228d39a6..d3400c15ff7b 100644
>>> --- a/drivers/iommu/mtk_iommu.c
>>> +++ b/drivers/iommu/mtk_iommu.c
>>> @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
>>>       }
>>>   }
>>>
>>> +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
>>> +{
>>> +     mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
>>> +}
>>> +
>>>   static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
>>>                                               unsigned long iova, size_t granule,
>>>                                               void *cookie)
>>> @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
>>>       .map = mtk_iommu_map,
>>>       .unmap = mtk_iommu_unmap,
>>>       .flush_iotlb_all = mtk_iommu_flush_iotlb_all,
>>> +     .iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
>>>       .iotlb_sync = mtk_iommu_iotlb_sync,
>>>       .iova_to_phys = mtk_iommu_iova_to_phys,
>>>       .probe_device = mtk_iommu_probe_device,
>>>
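To make the trade-offs in the reply above concrete, here is a rough userspace model that counts operations instead of touching hardware. It is only a sketch under stated assumptions: `model_tlb` and the `model_*` functions are hypothetical names invented for illustration, and a uniform page size is assumed for simplicity. It compares the per-page invalidate-and-wait behaviour described in the thread with two of the suggested alternatives: issuing an invalidate per page and waiting once, and issuing one ranged invalidate for the whole mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an IOMMU TLB: it counts commands only. */
struct model_tlb {
    unsigned long issued; /* invalidate commands issued */
    unsigned long waits;  /* times the CPU waited for completion */
};

static void model_invalidate(struct model_tlb *tlb, unsigned long iova,
                             size_t size)
{
    (void)iova;           /* the address is irrelevant to the counting */
    assert(size > 0);
    tlb->issued++;
}

static void model_wait(struct model_tlb *tlb)
{
    tlb->waits++;
}

/* Behaviour described in the thread: invalidate and wait after every page. */
static void model_map_sync_per_page(struct model_tlb *tlb, unsigned long iova,
                                    size_t size, size_t pgsize)
{
    for (size_t off = 0; off < size; off += pgsize) {
        model_invalidate(tlb, iova + off, pgsize);
        model_wait(tlb);
    }
}

/* Alternative: issue an invalidate per page, wait once at the end. */
static void model_map_deferred_wait(struct model_tlb *tlb, unsigned long iova,
                                    size_t size, size_t pgsize)
{
    for (size_t off = 0; off < size; off += pgsize)
        model_invalidate(tlb, iova + off, pgsize);
    model_wait(tlb);
}

/* Alternative: one ranged invalidate and one wait for the whole mapping. */
static void model_map_sync_range(struct model_tlb *tlb, unsigned long iova,
                                 size_t size)
{
    model_invalidate(tlb, iova, size);
    model_wait(tlb);
}
```

For a 50MB buffer mapped with 4K pages, the first variant issues and waits 12800 times, the second issues 12800 times but waits once, and the third issues and waits once. Which of these is actually fastest on real hardware depends on whether issuing or waiting dominates the cost, which this model does not capture.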
* Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support @ 2020-10-23 16:07 ` Robin Murphy 0 siblings, 0 replies; 36+ messages in thread From: Robin Murphy @ 2020-10-23 16:07 UTC (permalink / raw) To: chao hao Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, iommu, linux-mediatek, Matthias Brugger, Mingyuan Ma, linux-arm-kernel On 2020-10-23 06:57, chao hao wrote: > On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote: >> On 2020-10-19 12:30, Chao Hao wrote: >>> MTK_IOMMU driver writes one page entry and does tlb flush at a time >>> currently. More optimal would be to aggregate the writes and flush >>> BUS buffer in the end. >> >> That's exactly what iommu_iotlb_gather_add_page() is meant to achieve. >> Rather than jumping straight into hacking up a new API to go round the >> back of the existing API design, it would be far better to ask the >> question of why that's not behaving as expected. > > Thanks for you review! > > iommu_iotlb_gather_add_page is put in io_pgtable_tlb_add_page(). > io_pgtable_tlb_add_page() only be called in > unmapping and mapping flow doesn't have it in linux iommu driver, but > mtk iommu needs to do tlb sync in mapping > and unmapping to avoid old data being in the iommu tlb. Right, as I suspected, if it's primarily about the map path then the answer is to use the existing API intended to accommodate that specific case; if that API doesn't quite do what you need, then just fix it! It doesn't make sense to clutter up the IOMMU core with multiple overlapping APIs for TLB invalidation, especially when TLB invalidation ultimately isn't all that complicated. > In addtion, we hope to do tlb sync once when all the pages mapping done. > iommu_iotlb_gather_add_page maybe do > tlb sync more than once. because one whole buffer consists of different > page size(1MB/64K/4K). 
So if your hardware doesn't care about the granule size used for invalidation matching the underlying mapping, use an implementation of iommu_flush_ops::tlb_add_page that doesn't care about the granule size! > Based on the previous considerations, don't find more appropriate the > way of tlb sync for mtk iommu, so we add a new API. I know I'm probably more familiar with this code than most people, but from my perspective, this reads like "my car was dirty, so I had to buy a new car" ;) >>> For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range() >>> instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase >>> 50% performance or more(depending on size of every page size) in >>> comparison to flushing after each page entry update. So we prefer to >>> use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and >>> tlb_flush_walk/leaf() for MTK platforms. >> >> In the case of mapping, it sounds like what you actually want to do is >> hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP >> cleverer, because the current implementation is as dumb as it could >> possibly be. > > iotlb_sync_map only has one parameter(iommu_domain), but mtk > iommu_domain maybe include the whole iova space, if mtk_iommu to do tlb > sync based on iommu_domain, it is equivalent to do tlb flush all in > fact. The first, and so far only, user of this API is the Tegra GART, which can only do a "flush all" operation, so the API is currently only as complex as it needs to be, which is to say "not very". There are plenty of options. > iommu driver will do tlb sync in every mapping page when mtk iommu sets > IO_PGTABLE_QUIRK_TLBI_ON_MAP(io_pgtable_tlb_flush_walk), > as is the commit message mentioned, it will drop mapping performance in > mtk platform. And as I said, that quirk is implemented in a really simplistic way, which is sure to be functionally correct, but has never been given any performance consideration. 
>> In fact if we simply passed an address range to >> .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all >> any more. > > I know it is not a good idea probably by adding a new api, but I found > out that tlb sync only to be done after mapping one page, so if > mtk_iommu hope to do tlb sync once after all the pages map done, could > you give me some advices? thanks! Getting rid of IO_PGTABLE_QUIRK_TLBI_ON_MAP and simply wiring .iotlb_sync_map to .flush_iotlb_all is certainly the easiest way to make .map quicker, although it will obviously have *some* impact on any other live mappings. Given that in principle this should have the least CPU overhead, then depending on TLB usage patterns there's a small chance it might actually work out as a reasonable tradeoff, so I wouldn't necessarily rule it out as a viable option without at least trying some tests. If it's cheap to *issue* invalidation commands, and the expense is in waiting for them to actually complete, then you could still fire off an invalidate for each page from .map, then wire up the sync (wait) step to .iotlb_sync_map, still without needing any core changes. Otherwise, it would seem reasonable to pass the complete address and size of the iommu_map() operation to .iotlb_sync_map, so drivers can perform their whole invalidation operation synchronously there. What I *don't* think makes sense is to try passing a gather structure through .map in the same way as for .unmap. That seems a bit too invasive for what is still a fairly exceptional case, and half the stuff that unmap operations will use the gather data for - freelists and such - won't ever be relevant to map operations, so symmetry isn't really an argument either. Robin. 
>>> Signed-off-by: Chao Hao <chao.hao@mediatek.com> >>> --- >>> drivers/iommu/mtk_iommu.c | 6 ++++++ >>> 1 file changed, 6 insertions(+) >>> >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c >>> index 785b228d39a6..d3400c15ff7b 100644 >>> --- a/drivers/iommu/mtk_iommu.c >>> +++ b/drivers/iommu/mtk_iommu.c >>> @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, >>> } >>> } >>> >>> +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size) >>> +{ >>> + mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL) >>> +} >>> + >>> static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather, >>> unsigned long iova, size_t granule, >>> void *cookie) >>> @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = { >>> .map = mtk_iommu_map, >>> .unmap = mtk_iommu_unmap, >>> .flush_iotlb_all = mtk_iommu_flush_iotlb_all, >>> + .iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync, >>> .iotlb_sync = mtk_iommu_iotlb_sync, >>> .iova_to_phys = mtk_iommu_iova_to_phys, >>> .probe_device = mtk_iommu_probe_device, >>> > > _______________________________________________ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support @ 2020-10-23 16:07 ` Robin Murphy 0 siblings, 0 replies; 36+ messages in thread From: Robin Murphy @ 2020-10-23 16:07 UTC (permalink / raw) To: chao hao Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, iommu, linux-mediatek, Matthias Brugger, Mingyuan Ma, linux-arm-kernel On 2020-10-23 06:57, chao hao wrote: > On Wed, 2020-10-21 at 17:55 +0100, Robin Murphy wrote: >> On 2020-10-19 12:30, Chao Hao wrote: >>> MTK_IOMMU driver writes one page entry and does tlb flush at a time >>> currently. More optimal would be to aggregate the writes and flush >>> BUS buffer in the end. >> >> That's exactly what iommu_iotlb_gather_add_page() is meant to achieve. >> Rather than jumping straight into hacking up a new API to go round the >> back of the existing API design, it would be far better to ask the >> question of why that's not behaving as expected. > > Thanks for you review! > > iommu_iotlb_gather_add_page is put in io_pgtable_tlb_add_page(). > io_pgtable_tlb_add_page() only be called in > unmapping and mapping flow doesn't have it in linux iommu driver, but > mtk iommu needs to do tlb sync in mapping > and unmapping to avoid old data being in the iommu tlb. Right, as I suspected, if it's primarily about the map path then the answer is to use the existing API intended to accommodate that specific case; if that API doesn't quite do what you need, then just fix it! It doesn't make sense to clutter up the IOMMU core with multiple overlapping APIs for TLB invalidation, especially when TLB invalidation ultimately isn't all that complicated. > In addtion, we hope to do tlb sync once when all the pages mapping done. > iommu_iotlb_gather_add_page maybe do > tlb sync more than once. because one whole buffer consists of different > page size(1MB/64K/4K). 
So if your hardware doesn't care about the granule size used for invalidation matching the underlying mapping, use an implementation of iommu_flush_ops::tlb_add_page that doesn't care about the granule size! > Based on the previous considerations, don't find more appropriate the > way of tlb sync for mtk iommu, so we add a new API. I know I'm probably more familiar with this code than most people, but from my perspective, this reads like "my car was dirty, so I had to buy a new car" ;) >>> For 50MB buffer mapping, if mtk_iommu driver use iotlb_sync_range() >>> instead of tlb_add_range() and tlb_flush_walk/leaf(), it can increase >>> 50% performance or more(depending on size of every page size) in >>> comparison to flushing after each page entry update. So we prefer to >>> use iotlb_sync_range() to replace iotlb_sync(), tlb_add_range() and >>> tlb_flush_walk/leaf() for MTK platforms. >> >> In the case of mapping, it sounds like what you actually want to do is >> hook up .iotlb_sync_map and generally make IO_PGTABLE_QUIRK_TLBI_ON_MAP >> cleverer, because the current implementation is as dumb as it could >> possibly be. > > iotlb_sync_map only has one parameter(iommu_domain), but mtk > iommu_domain maybe include the whole iova space, if mtk_iommu to do tlb > sync based on iommu_domain, it is equivalent to do tlb flush all in > fact. The first, and so far only, user of this API is the Tegra GART, which can only do a "flush all" operation, so the API is currently only as complex as it needs to be, which is to say "not very". There are plenty of options. > iommu driver will do tlb sync in every mapping page when mtk iommu sets > IO_PGTABLE_QUIRK_TLBI_ON_MAP(io_pgtable_tlb_flush_walk), > as is the commit message mentioned, it will drop mapping performance in > mtk platform. And as I said, that quirk is implemented in a really simplistic way, which is sure to be functionally correct, but has never been given any performance consideration. 
>> In fact if we simply passed an address range to >> .iotlb_sync_map, io-pgtable probably wouldn't need to be involved at all >> any more. > > I know it is not a good idea probably by adding a new api, but I found > out that tlb sync only to be done after mapping one page, so if > mtk_iommu hope to do tlb sync once after all the pages map done, could > you give me some advices? thanks! Getting rid of IO_PGTABLE_QUIRK_TLBI_ON_MAP and simply wiring .iotlb_sync_map to .flush_iotlb_all is certainly the easiest way to make .map quicker, although it will obviously have *some* impact on any other live mappings. Given that in principle this should have the least CPU overhead, then depending on TLB usage patterns there's a small chance it might actually work out as a reasonable tradeoff, so I wouldn't necessarily rule it out as a viable option without at least trying some tests. If it's cheap to *issue* invalidation commands, and the expense is in waiting for them to actually complete, then you could still fire off an invalidate for each page from .map, then wire up the sync (wait) step to .iotlb_sync_map, still without needing any core changes. Otherwise, it would seem reasonable to pass the complete address and size of the iommu_map() operation to .iotlb_sync_map, so drivers can perform their whole invalidation operation synchronously there. What I *don't* think makes sense is to try passing a gather structure through .map in the same way as for .unmap. That seems a bit too invasive for what is still a fairly exceptional case, and half the stuff that unmap operations will use the gather data for - freelists and such - won't ever be relevant to map operations, so symmetry isn't really an argument either. Robin. 

>>> Signed-off-by: Chao Hao <chao.hao@mediatek.com>
>>> ---
>>>  drivers/iommu/mtk_iommu.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>> index 785b228d39a6..d3400c15ff7b 100644
>>> --- a/drivers/iommu/mtk_iommu.c
>>> +++ b/drivers/iommu/mtk_iommu.c
>>> @@ -224,6 +224,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
>>>  	}
>>>  }
>>>
>>> +static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
>>> +{
>>> +	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
>>> +}
>>> +
>>>  static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
>>>  					    unsigned long iova, size_t granule,
>>>  					    void *cookie)
>>> @@ -536,6 +541,7 @@ static const struct iommu_ops mtk_iommu_ops = {
>>>  	.map = mtk_iommu_map,
>>>  	.unmap = mtk_iommu_unmap,
>>>  	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
>>> +	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
>>>  	.iotlb_sync = mtk_iommu_iotlb_sync,
>>>  	.iova_to_phys = mtk_iommu_iova_to_phys,
>>>  	.probe_device = mtk_iommu_probe_device,
>>>
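
The map-side options discussed in the reply above can be compared with a small userspace model. This is only a sketch under stated assumptions: `mock_tlb`, `mock_issue_invalidate()`, `mock_wait_done()` and the three `map_*` helpers are hypothetical names invented for illustration, not kernel or MediaTek driver APIs, and the counters stand in for real hardware commands. It contrasts invalidating and waiting after every page, issuing per page but waiting once, and issuing a single ranged invalidate once the whole mapping is done.

```c
#include <stddef.h>

/* Hypothetical stand-in for the IOMMU: counts commands, touches no hardware. */
struct mock_tlb {
	unsigned long issued;	/* invalidate commands written */
	unsigned long waits;	/* times completion was polled for */
};

static void mock_issue_invalidate(struct mock_tlb *tlb, unsigned long iova,
				  size_t size)
{
	(void)iova;
	(void)size;
	tlb->issued++;
}

static void mock_wait_done(struct mock_tlb *tlb)
{
	tlb->waits++;
}

/* A: invalidate and wait after every page-table entry update. */
static void map_flush_each_page(struct mock_tlb *tlb, unsigned long iova,
				size_t size, size_t pgsize)
{
	for (size_t off = 0; off < size; off += pgsize) {
		/* a real driver would write the entry for iova + off here */
		mock_issue_invalidate(tlb, iova + off, pgsize);
		mock_wait_done(tlb);
	}
}

/* B: issue one invalidate per page, but wait only once at the end. */
static void map_issue_each_wait_once(struct mock_tlb *tlb, unsigned long iova,
				     size_t size, size_t pgsize)
{
	for (size_t off = 0; off < size; off += pgsize)
		mock_issue_invalidate(tlb, iova + off, pgsize);
	mock_wait_done(tlb);
}

/* C: one ranged invalidate and one wait after all entries are written. */
static void map_flush_range_once(struct mock_tlb *tlb, unsigned long iova,
				 size_t size)
{
	mock_issue_invalidate(tlb, iova, size);
	mock_wait_done(tlb);
}
```

For a 50MB buffer built from 4K pages, strategy A performs one wait per page while B and C each wait once; which of B or C is cheaper on real hardware depends on how expensive it is to issue a command, which this model does not measure.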
* [PATCH 3/4] iommu/mediatek: Remove unnecessary tlb sync
  2020-10-19 11:30 ` Chao Hao
  (?)
  (?)
@ 2020-10-19 11:30 ` Chao Hao
  -1 siblings, 0 replies; 36+ messages in thread
From: Chao Hao @ 2020-10-19 11:30 UTC (permalink / raw)
To: Joerg Roedel, Matthias Brugger
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, linux-arm-kernel, Mingyuan Ma

As is "[PATCH 2/4]" described, we will use iotlb_sync_range() to replace
iotlb_sync(), tlb_add_range() and tlb_flush_walk/leaf() to enhance
performance. So we will remove the implementation of iotlb_sync(),
tlb_add_range() and tlb_flush_walk/leaf().

Signed-off-by: Chao Hao <chao.hao@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 28 ++++------------------------
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d3400c15ff7b..bca1f53c0ab9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -229,21 +229,15 @@ static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
 	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
 }

-static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
-					    unsigned long iova, size_t granule,
-					    void *cookie)
+static void mtk_iommu_tlb_flush_skip(unsigned long iova, size_t size,
+				     size_t granule, void *cookie)
 {
-	struct mtk_iommu_data *data = cookie;
-	struct iommu_domain *domain = &data->m4u_dom->domain;
-
-	iommu_iotlb_gather_add_page(domain, gather, iova, granule);
 }

 static const struct iommu_flush_ops mtk_iommu_flush_ops = {
 	.tlb_flush_all = mtk_iommu_tlb_flush_all,
-	.tlb_flush_walk = mtk_iommu_tlb_flush_range_sync,
-	.tlb_flush_leaf = mtk_iommu_tlb_flush_range_sync,
-	.tlb_add_page = mtk_iommu_tlb_flush_page_nosync,
+	.tlb_flush_walk = mtk_iommu_tlb_flush_skip,
+	.tlb_flush_leaf = mtk_iommu_tlb_flush_skip,
 };

 static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
@@ -443,19 +437,6 @@ static void mtk_iommu_flush_iotlb_all(struct iommu_domain *domain)
 	mtk_iommu_tlb_flush_all(mtk_iommu_get_m4u_data());
 }

-static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
-				 struct iommu_iotlb_gather *gather)
-{
-	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
-	size_t length = gather->end - gather->start;
-
-	if (gather->start == ULONG_MAX)
-		return;
-
-	mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
-				       data);
-}
-
 static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
 					  dma_addr_t iova)
 {
@@ -542,7 +523,6 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.unmap = mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
 	.iotlb_sync_range = __mtk_iommu_tlb_flush_range_sync,
-	.iotlb_sync = mtk_iommu_iotlb_sync,
 	.iova_to_phys = mtk_iommu_iova_to_phys,
 	.probe_device = mtk_iommu_probe_device,
 	.release_device = mtk_iommu_release_device,
--
2.18.0
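
The gather-based path that this patch removes can be illustrated with a simplified userspace model. It is a sketch only: `mock_gather` and its helpers are hypothetical names, loosely modelled on the `start`, `end` and `pgsize` fields and the `ULONG_MAX` "empty" check visible in the removed `mtk_iommu_iotlb_sync()`. The rule that a change of page size forces an intermediate sync is an assumption about how the core gather helper behaved at the time, not a copy of kernel code. It shows why a buffer made of mixed page sizes (1MB/64K/4K) can trigger more than one sync.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical model of a TLB gather structure; not the kernel type. */
struct mock_gather {
	unsigned long start;		/* ULONG_MAX means "nothing queued" */
	unsigned long end;
	size_t pgsize;
	unsigned long syncs;		/* range flushes actually performed */
	unsigned long flushed_bytes;	/* total size handed to those flushes */
};

static void mock_gather_reset(struct mock_gather *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
	g->pgsize = 0;
}

/* Flush whatever range has been queued, then empty the gather. */
static void mock_gather_sync(struct mock_gather *g)
{
	if (g->start == ULONG_MAX)
		return;		/* same early-out as in the removed callback */

	g->syncs++;
	g->flushed_bytes += g->end - g->start;
	mock_gather_reset(g);
}

/* Queue one page; a disjoint range or a new page size forces a sync first. */
static void mock_gather_add_page(struct mock_gather *g, unsigned long iova,
				 size_t size)
{
	unsigned long start = iova, end = iova + size;

	if (g->pgsize != size || end < g->start || start > g->end) {
		if (g->pgsize)
			mock_gather_sync(g);
		g->pgsize = size;
	}

	if (g->end < end)
		g->end = end;
	if (g->start > start)
		g->start = start;
}
```

Queuing one 1MB page, two adjacent 64K pages and one 4K page in this model produces three syncs rather than one, which matches the concern raised earlier in the thread about buffers composed of different page sizes.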
* [PATCH 4/4] iommu/mediatek: Adjust iotlb_sync_range
  2020-10-19 11:30 ` Chao Hao
  (?)
  (?)
@ 2020-10-19 11:31 ` Chao Hao
  -1 siblings, 0 replies; 36+ messages in thread
From: Chao Hao @ 2020-10-19 11:31 UTC (permalink / raw)
To: Joerg Roedel, Matthias Brugger
Cc: Jun Wen, FY Yang, wsd_upstream, linux-kernel, Chao Hao, iommu,
linux-mediatek, linux-arm-kernel, Mingyuan Ma

As is title, the patch only adjusts the architecture of
iotlb_sync_range().

No functional change.

Signed-off-by: Chao Hao <chao.hao@mediatek.com>
---
 drivers/iommu/mtk_iommu.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index bca1f53c0ab9..66e5b9d3c575 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -191,10 +191,9 @@ static void mtk_iommu_tlb_flush_all(void *cookie)
 	}
 }

-static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
-					   size_t granule, void *cookie)
+static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
 {
-	struct mtk_iommu_data *data = cookie;
+	struct mtk_iommu_data *data;
 	unsigned long flags;
 	int ret;
 	u32 tmp;
@@ -216,7 +215,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 		if (ret) {
 			dev_warn(data->dev,
 				 "Partial TLB flush timed out, falling back to full flush\n");
-			mtk_iommu_tlb_flush_all(cookie);
+			mtk_iommu_tlb_flush_all(data);
 		}
 		/* Clear the CPE status */
 		writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
@@ -224,11 +223,6 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
 	}
 }

-static void __mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size)
-{
-	mtk_iommu_tlb_flush_range_sync(iova, size, 0, NULL)
-}
-
 static void mtk_iommu_tlb_flush_skip(unsigned long iova, size_t size,
 				     size_t granule, void *cookie)
 {
@@ -522,7 +516,7 @@ static const struct iommu_ops mtk_iommu_ops = {
 	.map = mtk_iommu_map,
 	.unmap = mtk_iommu_unmap,
 	.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
-	.iotlb_sync_range = mtk_iommu_tlb_flush_range_sync,
+	.iotlb_sync_range = mtk_iommu_tlb_flush_range_sync,
 	.iova_to_phys = mtk_iommu_iova_to_phys,
 	.probe_device = mtk_iommu_probe_device,
 	.release_device = mtk_iommu_release_device,
--
2.18.0
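
The shape of this change, a range-sync callback that takes only `(iova, size)` and no cookie, can be sketched in userspace as follows. The hunks above do not show where `data` is assigned once the cookie parameter is dropped, so the sketch assumes the callback walks a driver-global list of IOMMU instances. That assumption, and every `mock_*` name, is hypothetical and for illustration only.

```c
#include <stddef.h>

#define MOCK_NR_M4U 2	/* assumed number of IOMMU instances */

/* Hypothetical per-instance state; stands in for the driver's private data. */
struct mock_iommu_data {
	unsigned long flushes;
	unsigned long last_iova;
	size_t last_size;
};

/* Assumed driver-global instance list that a cookie-less callback can walk. */
static struct mock_iommu_data mock_m4u_list[MOCK_NR_M4U];

static void mock_flush_one(struct mock_iommu_data *data, unsigned long iova,
			   size_t size)
{
	data->flushes++;
	data->last_iova = iova;
	data->last_size = size;
}

/* Two-argument callback: no cookie, so instances are looked up globally. */
static void mock_iotlb_sync_range(unsigned long iova, size_t size)
{
	struct mock_iommu_data *data;

	for (data = mock_m4u_list; data < mock_m4u_list + MOCK_NR_M4U; data++)
		mock_flush_one(data, iova, size);
}

/* Minimal ops table holding only the callback discussed here. */
struct mock_iommu_ops {
	void (*iotlb_sync_range)(unsigned long iova, size_t size);
};

static const struct mock_iommu_ops mock_ops = {
	.iotlb_sync_range = mock_iotlb_sync_range,
};
```

With the signature reduced to `(iova, size)`, the function can be stored in the ops table directly, which is why the wrapper from the earlier patch becomes unnecessary.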
end of thread, other threads:[~2020-10-23 16:09 UTC | newest]

Thread overview: 36+ messages
2020-10-19 11:30 [PATCH 0/4] MTK_IOMMU: Optimize mapping / unmapping performance Chao Hao
2020-10-19 11:30 ` [PATCH 1/4] iommu: Introduce iotlb_sync_range callback Chao Hao
2020-10-19 11:30 ` [PATCH 2/4] iommu/mediatek: Add iotlb_sync_range() support Chao Hao
2020-10-21 16:55   ` Robin Murphy
2020-10-23  5:57   ` chao hao
2020-10-23  6:04   ` chao hao
2020-10-23 16:07   ` Robin Murphy
2020-10-19 11:30 ` [PATCH 3/4] iommu/mediatek: Remove unnecessary tlb sync Chao Hao
2020-10-19 11:31 ` [PATCH 4/4] iommu/mediatek: Adjust iotlb_sync_range Chao Hao