From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751473AbdJSJMY (ORCPT ); Thu, 19 Oct 2017 05:12:24 -0400
Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:50012 "EHLO
	foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750939AbdJSJMV (ORCPT ); Thu, 19 Oct 2017 05:12:21 -0400
Date: Thu, 19 Oct 2017 10:12:26 +0100
From: Will Deacon
To: "Leizhen (ThunderTown)"
Cc: Joerg Roedel, linux-arm-kernel, iommu, Robin Murphy, linux-kernel,
	Hanjun Guo, Libin, Jinyue Li, Kefeng Wang
Subject: Re: [PATCH v2 1/3] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction
Message-ID: <20171019091225.GA29762@arm.com>
References: <1505221238-9428-1-git-send-email-thunder.leizhen@huawei.com>
	<1505221238-9428-2-git-send-email-thunder.leizhen@huawei.com>
	<20171018125849.GD4077@arm.com>
	<59E8155D.2070102@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <59E8155D.2070102@huawei.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 19, 2017 at 11:00:45AM +0800, Leizhen (ThunderTown) wrote:
>
>
> On 2017/10/18 20:58, Will Deacon wrote:
> > Hi Thunder,
> >
> > On Tue, Sep 12, 2017 at 09:00:36PM +0800, Zhen Lei wrote:
> >> Because all TLBI commands should be followed by a SYNC command, to make
> >> sure that it has been completely finished. So we can just add the TLBI
> >> commands into the queue, and put off the execution until meet SYNC or
> >> other commands. To prevent the followed SYNC command waiting for a long
> >> time because of too many commands have been delayed, restrict the max
> >> delayed number.
> >>
> >> According to my test, I got the same performance data as I replaced writel
> >> with writel_relaxed in queue_inc_prod.
> >>
> >> Signed-off-by: Zhen Lei
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 42 +++++++++++++++++++++++++++++++++++++-----
> >>  1 file changed, 37 insertions(+), 5 deletions(-)
> >
> > If we want to go down the route of explicit command batching, I'd much
> > rather do it by implementing the iotlb_range_add callback in the driver,
> > and have a fixed-length array of batched ranges on the domain. We could
> I think even if iotlb_range_add callback is implemented, this patch is still valuable. The main purpose
> of this patch is to reduce dsb operation. So in the scenario with iotlb_range_add implemented:
> .iotlb_range_add:
> spin_lock_irqsave(&smmu->cmdq.lock, flags);
> ...
> add tlbi range-1 to cmq-queue
> ...
> add tlbi range-n to cmq-queue		//n
> dsb
> ...
> spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
>
> .iotlb_sync
> spin_lock_irqsave(&smmu->cmdq.lock, flags);
> ...
> add cmd_sync to cmq-queue
> dsb
> ...
> spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
>
> Although iotlb_range_add can reduce n-1 dsb operations, but there are
> still 1 left. If n is not large enough, this patch is helpful.

Then pick an n that is large enough, based on the compatible string.

Will
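
For readers following the barrier counting in the exchange above, here is a
rough user-space model of the three schemes under discussion. It is only a
sketch and not driver code: struct cmdq_model, cmdq_insert(), cmdq_publish()
and the barriers_*() helpers are hypothetical names invented for this
illustration, and "publish" merely stands in for the dsb plus the producer
register write. The cap on delayed commands that the patch description
mentions is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define CMDQ_SIZE 64 /* hypothetical queue depth, not taken from the driver */

enum cmd_op { CMD_TLBI, CMD_SYNC };

/* Toy command queue: counts how often commands are published to "hardware". */
struct cmdq_model {
	enum cmd_op ring[CMDQ_SIZE];
	size_t prod;       /* software producer index */
	size_t hw_prod;    /* index last made visible to the "hardware" */
	unsigned barriers; /* number of publish steps (dsb + register write) */
};

/* Copy one command into the ring without publishing it. */
static void cmdq_insert(struct cmdq_model *q, enum cmd_op op)
{
	q->ring[q->prod % CMDQ_SIZE] = op;
	q->prod++;
}

/* Publish everything inserted so far; this is the step that costs a dsb. */
static void cmdq_publish(struct cmdq_model *q)
{
	q->hw_prod = q->prod;
	q->barriers++;
}

/* Baseline: every TLBI is published on its own, then the SYNC. */
static unsigned barriers_unbatched(size_t n)
{
	struct cmdq_model q = {0};

	assert(n > 0 && n < CMDQ_SIZE);
	for (size_t i = 0; i < n; i++) {
		cmdq_insert(&q, CMD_TLBI);
		cmdq_publish(&q);
	}
	cmdq_insert(&q, CMD_SYNC);
	cmdq_publish(&q);
	return q.barriers;
}

/* range_add-style batching: n TLBIs share one publish, the SYNC gets another. */
static unsigned barriers_batched(size_t n)
{
	struct cmdq_model q = {0};

	assert(n > 0 && n < CMDQ_SIZE);
	for (size_t i = 0; i < n; i++)
		cmdq_insert(&q, CMD_TLBI);
	cmdq_publish(&q);
	cmdq_insert(&q, CMD_SYNC);
	cmdq_publish(&q);
	return q.barriers;
}

/* Deferred TLBIs: nothing is published until the SYNC is queued. */
static unsigned barriers_deferred(size_t n)
{
	struct cmdq_model q = {0};

	assert(n > 0 && n < CMDQ_SIZE);
	for (size_t i = 0; i < n; i++)
		cmdq_insert(&q, CMD_TLBI);
	cmdq_insert(&q, CMD_SYNC);
	cmdq_publish(&q);
	return q.barriers;
}
```

In this model, n TLBIs plus a SYNC cost n + 1 publishes unbatched, 2 with
range_add-style batching, and 1 when the TLBIs are deferred until the SYNC.
The fixed difference of one publish between the last two schemes is the
"1 left" in the quoted text, and it shrinks relative to the total work as n
grows.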