From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21])
  (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
  (No client certificate requested)
  by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A5433845CE
  for ; Thu, 2 Apr 2026 07:00:32 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
  t=1775113236; cv=none;
  b=m9+e5qGRY/WwbluXc0d7QPm8CWii9CLuLKf3CI3Xlo1TeQinJZ2GKv36XYo6im7rVSNEKsJeczCva1Ra4znTvysuhdtGpiw5//8zeZCfhr1Bw2QmP9aRFxdG/3WwDYNkDqpXUyYLAVjUakgskGk8UPsLi/fV6YKF2vHlsekaJd0=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
  s=arc-20240116; t=1775113236; c=relaxed/simple;
  bh=Yu63ydkahPogUMvV2raVoIg5D9VM0YAdDSsvucDzLfI=;
  h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
  MIME-Version;
  b=XDL+BaukcLkLuMOc9t9jhuKmyRP7sEbBXVfILZ81EagSNQWOGuosM+1ZWy+twbQmdTHdt2hRvS2a/h5EGOeKQye1tcda6gTq1hTE0waMOiJIc7BbwIVrc5ZewQMWYA6L4v+rflKpgASX3fE0TensZutRZdhxRGPQQ3U27XVrW48=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
  dmarc=pass (p=none dis=none) header.from=linux.intel.com;
  spf=pass smtp.mailfrom=linux.intel.com;
  dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=P3r73HJS;
  arc=none smtp.client-ip=198.175.65.21
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com
Authentication-Results: smtp.subspace.kernel.org;
  dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="P3r73HJS"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
  d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
  t=1775113233; x=1806649233;
  h=from:to:cc:subject:date:message-id:in-reply-to:
  references:mime-version:content-transfer-encoding;
  bh=Yu63ydkahPogUMvV2raVoIg5D9VM0YAdDSsvucDzLfI=;
  b=P3r73HJSCr1nzPA5m59NDgvu2FnJFjc2w8TFDB7EjVDP4SjsCHA+R14G
  tDKftDeZWyyfY7z1KI4bkbQhKvSt5+B9K7l+Erp6fHIkIDSIe0cV118uL
  jes6e8qTf3HkG6cgZVCCdMgEpNsm1LWZNKea4bPbG2i4qeQcjrvL/SL9v
  JDjyX9VyTuv2ohop5xlp//uHwGw08sUuy19Z7VGmdoZLIyQOXAopQ65i0
  dmA0aS3wvf19z5D6u7zwvnTahyVnAde8fPxQD98WxUWhtQcWYekdlaFW9
  E8e/mCzU/z/K8U9/NqDl66iOWWsEcMr5wu/TH+Mza3sadeYDx5WDh1gES
  w==;
X-CSE-ConnectionGUID: D0KsfE71TVK8wuAxsxxXFg==
X-CSE-MsgGUID: 0ZTf4KPPRGyPra8ZQLVFIw==
X-IronPort-AV: E=McAfee;i="6800,10657,11746"; a="76053672"
X-IronPort-AV: E=Sophos;i="6.23,155,1770624000"; d="scan'208";a="76053672"
Received: from orviesa005.jf.intel.com ([10.64.159.145])
  by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Apr 2026 00:00:31 -0700
X-CSE-ConnectionGUID: eGiuhs1kQiaoaJrfZnII9Q==
X-CSE-MsgGUID: TeFpf/a1TwadyTa3RcF5hQ==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.23,155,1770624000"; d="scan'208";a="231847921"
Received: from allen-box.sh.intel.com ([10.239.159.52])
  by orviesa005.jf.intel.com with ESMTP; 02 Apr 2026 00:00:30 -0700
From: Lu Baolu
To: Joerg Roedel
Cc: Zhenzhong Duan,
  Bjorn Helgaas,
  Jason Gunthorpe,
  iommu@lists.linux.dev,
  linux-kernel@vger.kernel.org
Subject: [PATCH 10/10] iommu/vt-d: Simplify calculate_psi_aligned_address()
Date: Thu, 2 Apr 2026 14:57:33 +0800
Message-ID: <20260402065734.1687476-11-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260402065734.1687476-1-baolu.lu@linux.intel.com>
References: <20260402065734.1687476-1-baolu.lu@linux.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Jason Gunthorpe

This is doing far too much math for the simple task of finding a power of 2
that fully spans the given range. Use fls directly on the xor which computes
the common binary prefix.

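For illustration only, and not part of the patch: a minimal userspace sketch of the xor-and-fls idea described above. The names fls_long_sketch(), span_base() and SKETCH_PAGE_SHIFT are hypothetical stand-ins for the kernel's fls_long(), the reworked calculate_psi_aligned_address() and VTD_PAGE_SHIFT. The MAX_AGAW_PFN_WIDTH clamp is left out, and 4 KiB pages are assumed.

```c
/* Assumption: 4 KiB pages, standing in for VTD_PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Userspace stand-in for the kernel's fls_long(): 1-based index of the
 * most significant set bit, or 0 when x is 0.
 */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Return the base of the smallest naturally aligned power-of-2 block
 * that contains both start and last, and store its size in pages
 * (as a log2 order) in *order.
 */
static unsigned long span_base(unsigned long start, unsigned long last,
			       unsigned int *order)
{
	unsigned int lg2;

	/* Drop the in-page offset so only page-level bits are compared. */
	start &= ~((1UL << SKETCH_PAGE_SHIFT) - 1);

	/* Bits above the highest differing bit form the common prefix. */
	lg2 = fls_long_sketch(start ^ last);
	if (lg2 <= SKETCH_PAGE_SHIFT) {
		/* start and last fall within the same page. */
		*order = 0;
		return start;
	}

	*order = lg2 - SKETCH_PAGE_SHIFT;

	/* A shift by the full word width is undefined, so handle it here. */
	if (lg2 >= sizeof(unsigned long) * 8)
		return 0;

	/* Keep only the common prefix; the low lg2 bits become zero. */
	return start & ~((1UL << lg2) - 1);
}
```

With start = 0x5000 and last = 0x6fff, start ^ last is 0x3fff, so the highest differing bit is bit 13 and the span is 16 KiB: the sketch returns base 0x4000 with order 2. The range itself is only two pages long, but because it is not aligned to two pages, a four-page block is the smallest aligned block that covers it.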
Signed-off-by: Jason Gunthorpe
Link: https://lore.kernel.org/r/4-v1-f175e27af136+11647-iommupt_inv_vtd_jgg@nvidia.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/cache.c | 49 ++++++++++++-------------------------
 1 file changed, 16 insertions(+), 33 deletions(-)

diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index be8410f0e841..54dd9f7323bd 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -254,37 +254,25 @@ void cache_tag_unassign_domain(struct dmar_domain *domain,
 }
 
 static unsigned long calculate_psi_aligned_address(unsigned long start,
-						   unsigned long end,
-						   unsigned long *_mask)
+						   unsigned long last,
+						   unsigned long *size_order)
 {
-	unsigned long pages = aligned_nrpages(start, end - start + 1);
-	unsigned long aligned_pages = __roundup_pow_of_two(pages);
-	unsigned long bitmask = aligned_pages - 1;
-	unsigned long mask = ilog2(aligned_pages);
-	unsigned long pfn = IOVA_PFN(start);
+	unsigned int sz_lg2;
 
-	/*
-	 * PSI masks the low order bits of the base address. If the
-	 * address isn't aligned to the mask, then compute a mask value
-	 * needed to ensure the target range is flushed.
-	 */
-	if (unlikely(bitmask & pfn)) {
-		unsigned long end_pfn = pfn + pages - 1, shared_bits;
-
-		/*
-		 * Since end_pfn <= pfn + bitmask, the only way bits
-		 * higher than bitmask can differ in pfn and end_pfn is
-		 * by carrying. This means after masking out bitmask,
-		 * high bits starting with the first set bit in
-		 * shared_bits are all equal in both pfn and end_pfn.
-		 */
-		shared_bits = ~(pfn ^ end_pfn) & ~bitmask;
-		mask = shared_bits ? __ffs(shared_bits) : MAX_AGAW_PFN_WIDTH;
+	/* Compute a sz_lg2 that spans start and last */
+	start &= GENMASK(BITS_PER_LONG - 1, VTD_PAGE_SHIFT);
+	sz_lg2 = fls_long(start ^ last);
+	if (sz_lg2 <= 12) {
+		*size_order = 0;
+		return start;
+	}
+	if (unlikely(sz_lg2 >= MAX_AGAW_PFN_WIDTH)) {
+		*size_order = MAX_AGAW_PFN_WIDTH;
+		return 0;
 	}
 
-	*_mask = mask;
-
-	return ALIGN_DOWN(start, VTD_PAGE_SIZE << mask);
+	*size_order = sz_lg2 - VTD_PAGE_SHIFT;
+	return start & GENMASK(BITS_PER_LONG - 1, sz_lg2);
 }
 
 static void qi_batch_flush_descs(struct intel_iommu *iommu, struct qi_batch *batch)
@@ -441,12 +429,7 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
 	struct cache_tag *tag;
 	unsigned long flags;
 
-	if (start == 0 && end == ULONG_MAX) {
-		addr = 0;
-		mask = MAX_AGAW_PFN_WIDTH;
-	} else {
-		addr = calculate_psi_aligned_address(start, end, &mask);
-	}
+	addr = calculate_psi_aligned_address(start, end, &mask);
 
 	spin_lock_irqsave(&domain->cache_lock, flags);
 	list_for_each_entry(tag, &domain->cache_tags, node) {
-- 
2.43.0