From: Chao Gao
To: linux-kernel@vger.kernel.org, iommu@lists.linux.dev
Cc: dave.hansen@intel.com, len.brown@intel.com, tony.luck@intel.com,
    rafael.j.wysocki@intel.com, reinette.chatre@intel.com,
    dan.j.williams@intel.com, kirill.shutemov@linux.intel.com,
    sathyanarayanan.kuppuswamy@linux.intel.com,
    ilpo.jarvinen@linux.intel.com, ak@linux.intel.com,
    alexander.shishkin@linux.intel.com, Chao Gao
Subject: [RFC v2 0/2] swiotlb performance optimizations
Date: Mon, 18 Jul 2022 09:28:16 +0800
Message-Id: <20220718012818.107051-1-chao.gao@intel.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: iommu@lists.linux.dev

Intent of this post:
 Seek reviews from Intel reviewers and anyone else on the list interested in
 IO performance in confidential VMs. I need some Acked-by/Reviewed-by tags
 before I can add the swiotlb maintainers to the To/Cc lists and ask them
 for a review.

Changes from v1 to v2:
- Rebase onto the latest dma-mapping tree.
- Drop the duplicate patch for mitigating lock contention.
- Re-collect perf data.

swiotlb is now widely used by confidential VMs. This series optimizes
swiotlb to reduce cache misses and lock contention during bounce buffer
allocation/free and memory bouncing, in order to improve IO workload
performance in confidential VMs.

Here are some FIO tests we did to demonstrate the improvement.

Test setup
----------

A normal VM with 8 vCPUs and 32G memory; swiotlb is enabled with
swiotlb=force. FIO block size is 4K and iodepth is 256. Note that a normal
VM is used so that others who lack the hardware needed to host confidential
VMs can reproduce the results below.

Results
-------

1 FIO job     read/write    IOPS (k)
vanilla       read          216
              write         251
optimized     read          250
              write         270

1-job FIO sequential read/write performance increases by 19% and 8%
respectively.

Chao Gao (2):
  swiotlb: use bitmap to track free slots
  swiotlb: Allocate memory in a cache-friendly way

 include/linux/swiotlb.h |   8 ++-
 kernel/dma/swiotlb.c    | 127 +++++++++++++++++-----------------------
 2 files changed, 60 insertions(+), 75 deletions(-)

-- 
2.25.1
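
For readers unfamiliar with the idea behind "use bitmap to track free slots",
here is a minimal user-space sketch of bitmap-based slot tracking. It is not
the code from the patch and does not use the kernel bitmap helpers. The names
slot_pool, slots_alloc, slots_free and NSLOTS are hypothetical, the pool size
is arbitrary, and locking is left out entirely.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical pool size; not taken from the series. */
#define NSLOTS 128
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NWORDS ((NSLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per slot: set means "in use", clear means "free". */
struct slot_pool {
	unsigned long used[NWORDS];
};

static int slot_is_used(const struct slot_pool *p, size_t i)
{
	return (p->used[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

/*
 * Claim `count` contiguous free slots using a first-fit linear scan.
 * Returns the index of the first slot, or -1 if no such run exists.
 */
long slots_alloc(struct slot_pool *p, size_t count)
{
	size_t run = 0;

	if (count == 0 || count > NSLOTS)
		return -1;

	for (size_t i = 0; i < NSLOTS; i++) {
		/* Extend the current free run, or reset it on a used slot. */
		run = slot_is_used(p, i) ? 0 : run + 1;
		if (run == count) {
			size_t start = i + 1 - count;

			for (size_t j = start; j <= i; j++)
				p->used[j / BITS_PER_WORD] |=
					1UL << (j % BITS_PER_WORD);
			return (long)start;
		}
	}
	return -1;
}

/* Release `count` slots starting at `start`; they must all be in use. */
void slots_free(struct slot_pool *p, size_t start, size_t count)
{
	assert(start + count <= NSLOTS);

	for (size_t j = start; j < start + count; j++) {
		assert(slot_is_used(p, j));
		p->used[j / BITS_PER_WORD] &= ~(1UL << (j % BITS_PER_WORD));
	}
}
```

With a zero-initialized pool, slots_alloc(&p, 4) claims slots 0-3, and a later
slots_free(&p, 0, 4) makes them available again. The state for all slots fits
in a few machine words, which is the kind of compactness a bitmap offers;
whether and how the series exploits that is described in the patches
themselves.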