Subject: Re: [PATCH] iommu/arm-smmu-v3: Add SMMUv3.2 range invalidation support
To: Rob Herring
References: <20200113143924.11576-1-robh@kernel.org> <2ee87a12-1a0e-bd48-0209-b5e205342d44@redhat.com>
From: Auger Eric
Date: Wed, 15 Jan 2020 17:32:57 +0100
Cc: Jean-Philippe Brucker, Will Deacon, Linux IOMMU, Robin Murphy, "moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE"
List-Id: Development issues for Linux IOMMU support

Hi Rob,

On 1/15/20 3:02 PM, Rob Herring wrote:
> On Wed, Jan 15, 2020 at 3:21 AM Auger Eric wrote:
>>
>> Hi Rob,
>>
>> On 1/13/20 3:39 PM, Rob Herring wrote:
>>> Arm SMMUv3.2 adds support for TLB range invalidate operations.
>>> Support for range invalidate is determined by the RIL bit in the IDR3
>>> register.
>>>
>>> The range invalidate is in units of the leaf page size and operates on
>>> 1-32 chunks of a power of 2 multiple pages. First we determine from the
>>> size what power of 2 multiple we can use and then adjust the granule to
>>> 32x that size.
>>>
>>> Cc: Eric Auger
>>> Cc: Jean-Philippe Brucker
>>> Cc: Will Deacon
>>> Cc: Robin Murphy
>>> Cc: Joerg Roedel
>>> Signed-off-by: Rob Herring
>>> ---
>>>  drivers/iommu/arm-smmu-v3.c | 53 +++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 53 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>>> index e91b4a098215..8b6b3e2aa383 100644
>>> --- a/drivers/iommu/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm-smmu-v3.c
>>> @@ -70,6 +70,9 @@
>>>  #define IDR1_SSIDSIZE GENMASK(10, 6)
>>>  #define IDR1_SIDSIZE GENMASK(5, 0)
>>>
>>> +#define ARM_SMMU_IDR3 0xc
>>> +#define IDR3_RIL (1 << 10)
>>> +
>>>  #define ARM_SMMU_IDR5 0x14
>>>  #define IDR5_STALL_MAX GENMASK(31, 16)
>>>  #define IDR5_GRAN64K (1 << 6)
>>> @@ -327,9 +330,14 @@
>>>  #define CMDQ_CFGI_1_LEAF (1UL << 0)
>>>  #define CMDQ_CFGI_1_RANGE GENMASK_ULL(4, 0)
>>>
>>> +#define CMDQ_TLBI_0_NUM GENMASK_ULL(16, 12)
>>> +#define CMDQ_TLBI_RANGE_NUM_MAX 32
>>> +#define CMDQ_TLBI_0_SCALE GENMASK_ULL(24, 20)
>>>  #define CMDQ_TLBI_0_VMID GENMASK_ULL(47, 32)
>>>  #define CMDQ_TLBI_0_ASID GENMASK_ULL(63, 48)
>>>  #define CMDQ_TLBI_1_LEAF (1UL << 0)
>>> +#define CMDQ_TLBI_1_TTL GENMASK_ULL(9, 8)
>>> +#define CMDQ_TLBI_1_TG GENMASK_ULL(11, 10)
>>>  #define CMDQ_TLBI_1_VA_MASK GENMASK_ULL(63, 12)
>>>  #define CMDQ_TLBI_1_IPA_MASK GENMASK_ULL(51, 12)
>>>
>>> @@ -455,9 +463,13 @@ struct arm_smmu_cmdq_ent {
>>>  		#define CMDQ_OP_TLBI_S2_IPA 0x2a
>>>  		#define CMDQ_OP_TLBI_NSNH_ALL 0x30
>>>  		struct {
>>> +			u8 num;
>>> +			u8 scale;
>>>  			u16 asid;
>>>  			u16 vmid;
>>>  			bool leaf;
>>> +			u8 ttl;
>>> +			u8 tg;
>>>  			u64 addr;
>>>  		} tlbi;
>>>
>>> @@ -595,6 +607,7 @@ struct arm_smmu_device {
>>>  #define ARM_SMMU_FEAT_HYP (1 << 12)
>>>  #define ARM_SMMU_FEAT_STALL_FORCE (1 << 13)
>>>  #define ARM_SMMU_FEAT_VAX (1 << 14)
>>> +#define ARM_SMMU_FEAT_RANGE_INV (1 << 15)
>>>  	u32 features;
>>>
>>>  #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
>>> @@ -856,13 +869,21 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>>>  		cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31);
>>>  		break;
>>>  	case CMDQ_OP_TLBI_NH_VA:
>>> +		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_NUM, ent->tlbi.num);
>>> +		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_SCALE, ent->tlbi.scale);
>>>  		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
>>>  		cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
>>> +		cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TTL, ent->tlbi.ttl);
>>> +		cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TG, ent->tlbi.tg);
>>>  		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
>>>  		break;
>>>  	case CMDQ_OP_TLBI_S2_IPA:
>>> +		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_NUM, ent->tlbi.num);
>>> +		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_SCALE, ent->tlbi.scale);
>>>  		cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
>>>  		cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
>>> +		cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TTL, ent->tlbi.ttl);
>>> +		cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TG, ent->tlbi.tg);
>>>  		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK;
>>>  		break;
>>>  	case CMDQ_OP_TLBI_NH_ASID:
>>> @@ -2022,12 +2043,39 @@ static void arm_smmu_tlb_inv_range(unsigned long iova, size_t size,
>>>  		cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
>>>  	}
>>>
>>> +	if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
>>> +		unsigned long tg, scale;
>>> +
>>> +		/* Get the leaf page size */
>>> +		tg = __ffs(smmu_domain->domain.pgsize_bitmap);
>> it is unclear to me why you can't set tg with the granule parameter.
>
> granule could be 2MB sections if THP is enabled, right?

Ah OK I thought it was a page size and not a block size.

I requested this feature a long time ago for virtual SMMUv3. With
DPDK/VFIO the guest was sending page TLB invalidation for each page
(granule=4K or 64K) part of the hugepage buffer and those were trapped
by the VMM. This stalled qemu.
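
To make the page size versus block size point concrete, here is a minimal
userspace sketch. It is not driver code: lowest_set_bit() and
leaf_page_shift() are hypothetical helpers, __builtin_ctzl() (a GCC/Clang
builtin) stands in for the kernel's __ffs(), and the SZ_* values are
redefined locally on the assumption of a 4K-granule page table.

```c
#include <assert.h>

#define SZ_4K 0x1000UL
#define SZ_2M 0x200000UL
#define SZ_1G 0x40000000UL

/* Userspace stand-in for the kernel's __ffs(): index of the lowest set bit. */
static unsigned long lowest_set_bit(unsigned long word)
{
	assert(word != 0);	/* result is undefined for 0, as with __ffs() */
	return (unsigned long)__builtin_ctzl(word);
}

/* log2 of the leaf page size, i.e. the smallest size present in the bitmap. */
static unsigned long leaf_page_shift(unsigned long pgsize_bitmap)
{
	return lowest_set_bit(pgsize_bitmap);
}
```

With pgsize_bitmap = SZ_4K | SZ_2M | SZ_1G, leaf_page_shift() gives 12, while
a 2MB granule passed in by the caller has log2 21. So the granule argument
alone does not say what the leaf page size is.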
>
>>> +
>>> +		/* Determine the power of 2 multiple number of pages */
>>> +		scale = __ffs(size / (1UL << tg));
>>> +		cmd.tlbi.scale = scale;
>>> +
>>> +		cmd.tlbi.num = CMDQ_TLBI_RANGE_NUM_MAX - 1;
>> Also could you explain why you use CMDQ_TLBI_RANGE_NUM_MAX.
>
> How's this:
> /* The invalidation loop defaults to the maximum range */

I would have expected num=0 directly. Don't we invalidate the @size in
one shot as 2^scale * pages of granularity @tg? I fail to understand
when NUM > 0.

Thanks

Eric
>
> And perhaps I'll move it next to setting granule.
>
>>> +
>>> +		/* Convert page size of 12,14,16 (log2) to 1,2,3 */
>>> +		cmd.tlbi.tg = ((tg - ilog2(SZ_4K)) / 2) + 1;
>>> +
>>> +		/* Determine what level the granule is at */
>>> +		cmd.tlbi.ttl = 4 - ((ilog2(granule) - 3) / (tg - 3));
>>> +
>>> +		/* Adjust granule to the maximum range */
>>> +		granule = CMDQ_TLBI_RANGE_NUM_MAX * (1 << scale) * (1UL << tg);
>> spec says
>> Range = ((NUM+1) * 2^SCALE) * Translation_Granule_Size
>
> (NUM+1) can be 1-32. I went with the logical max for
> CMDQ_TLBI_RANGE_NUM_MAX rather than the NUM field value max.
>
> Rob
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
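
As a worked illustration of the range formula quoted in the thread
(Range = ((NUM+1) * 2^SCALE) * Translation_Granule_Size), the sketch below
decomposes a size into a single NUM/SCALE pair. It is a userspace
approximation only and does not reproduce the patch's invalidation loop:
struct tlbi_range, tlbi_range_bytes() and tlbi_encode_single() are
hypothetical names, and __builtin_ctzl() (a GCC/Clang builtin) stands in
for __ffs().

```c
#include <assert.h>
#include <stdbool.h>

#define TLBI_RANGE_NUM_MAX 32UL	/* logical max of (NUM + 1), as in the patch */
#define TLBI_SCALE_MAX     31UL	/* SCALE is a 5-bit field */

struct tlbi_range {
	unsigned long num;	/* NUM field value, 0..31 */
	unsigned long scale;	/* SCALE field value, 0..31 */
};

/* Range = ((NUM + 1) * 2^SCALE) * Translation_Granule_Size */
static unsigned long tlbi_range_bytes(struct tlbi_range r, unsigned long tg)
{
	return ((r.num + 1) << r.scale) << tg;
}

/*
 * Try to describe `size` bytes (a nonzero multiple of the page size
 * 1 << tg) as one range. Returns false when a single NUM/SCALE pair
 * cannot cover it and several commands would be needed.
 */
static bool tlbi_encode_single(unsigned long size, unsigned long tg,
			       struct tlbi_range *out)
{
	unsigned long pages = size >> tg;
	unsigned long scale, chunks;

	assert(pages != 0 && (pages << tg) == size);

	scale = (unsigned long)__builtin_ctzl(pages);
	chunks = pages >> scale;	/* odd count of 2^scale-page chunks */
	if (scale > TLBI_SCALE_MAX || chunks > TLBI_RANGE_NUM_MAX)
		return false;

	out->scale = scale;
	out->num = chunks - 1;
	return true;
}
```

Under these assumptions a 2MB block with 4K pages (512 pages) encodes as
SCALE=9, NUM=0, whereas a 3-page (12K) range encodes as SCALE=0, NUM=2. Any
size whose page count is not a power of two is a case where NUM > 0.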