Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] iommu/arm-smmu-v3: Improve cmdq lock efficiency
@ 2020-06-01 11:50 John Garry
  2020-06-01 11:50 ` [PATCH RFC 1/2] iommu/arm-smmu-v3: Calculate bits for prod and owner John Garry
  2020-06-01 11:50 ` [PATCH RFC 2/2] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist() John Garry
  0 siblings, 2 replies; 3+ messages in thread
From: John Garry @ 2020-06-01 11:50 UTC (permalink / raw)
  To: will, robin.murphy
  Cc: song.bao.hua, maz, joro, John Garry, iommu, linux-arm-kernel

As mentioned in [0], the CPU may consume many cycles processing
arm_smmu_cmdq_issue_cmdlist(). One issue we find is the cmpxchg() loop to
get space on the queue takes approx 25% of the cycles for this function.

The cmpxchg() is removed as follows:
- We assume that the cmdq can never fill with changes to limit the
  batch size (where necessary) and always issue a CMD_SYNC for a batch
  We need to do this since we no longer maintain the cons value in
  software, and we cannot deal with no available space properly.
- Replace cmpxchg() with atomic inc operation, to maintain the prod
  and owner values.

Early experiments have shown that we may see a 25% boost in throughput
IOPS for my NVMe test with these changes. And some CPUs, which were
loaded at ~55%, now see a ~45% load.

So, even though the changes are incomplete and other parts of the driver
will need fixing up (and it looks maybe broken for !MSI support), the
performance boost seen would seem to be worth the effort of exploring
this.

Comments requested please.

Thanks

[0] https://lore.kernel.org/linux-iommu/B926444035E5E2439431908E3842AFD24B86DB@DGGEMI525-MBS.china.huawei.com/T/#ma02e301c38c3e94b7725e685757c27e39c7cbde3

John Garry (2):
  iommu/arm-smmu-v3: Calculate bits for prod and owner
  iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

 drivers/iommu/arm-smmu-v3.c | 92 +++++++++++++++++++++++----------------------
 1 file changed, 47 insertions(+), 45 deletions(-)

-- 
2.16.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-06-01 11:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-01 11:50 [PATCH RFC 0/2] iommu/arm-smmu-v3: Improve cmdq lock efficiency John Garry
2020-06-01 11:50 ` [PATCH RFC 1/2] iommu/arm-smmu-v3: Calculate bits for prod and owner John Garry
2020-06-01 11:50 ` [PATCH RFC 2/2] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist() John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox