Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs


From: Guorui Yu <GuoRui.Yu@linux.alibaba.com>
To: Andi Kleen <ak@linux.intel.com>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	konrad.wilk@oracle.com, linux-coco@lists.linux.dev
Cc: robin.murphy@arm.com
Subject: Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs
Date: Mon, 30 Jan 2023 10:25:06 +0800
Message-ID: <2ec59355-c8d5-c794-16e8-7d646b43c455@linux.alibaba.com>
In-Reply-To: <9b167caf-1b10-f97a-d96a-b7ead8e785e8@linux.intel.com>

On 2023/1/30 00:58, Andi Kleen wrote:
> 
> On 1/28/2023 12:32 AM, GuoRui.Yu wrote:
>> Under COnfidential COmputing (CoCo) scenarios, the VMM cannot access
>> guest memory directly but requires the guest to explicitly mark the
>> memory as shared (decrypted). To make the streaming DMA mappings work,
>> the current implementation relies on legacy SWIOTLB to bounce the DMA
>> buffer between private (encrypted) and shared (decrypted) memory.
>>
>> However, the legacy swiotlb is designed for compatibility rather than
>> efficiency and CoCo purpose, which will inevitably introduce some
>> unnecessary restrictions.
>> 1. Fixed immutable swiotlb size cannot accommodate the requirements of
>>     multiple devices. And 1GiB (current maximum size) of swiotlb in our
>>     testbed cannot afford multiple disks reads/writes simultaneously.
>> 2. Fixed immutable IO_TLB_SIZE (2KiB) cannot satisfy various kinds of
>>     devices. At the moment, the minimal size of a swiotlb buffer is 2KiB,
>>     which will waste memory on small network packets (under 512 bytes)
>>     and decrease efficiency on a large block (up to 256KiB) size
>>     reads/writes of disks. And it is hard to have a trade-off on legacy
>>     swiotlb to rule them all.
>> 3. The legacy swiotlb cannot efficiently support larger swiotlb buffers.
>>     In the worst case, the current implementation requires a full scan of
>>     the entire swiotlb buffer, which can cause severe performance hits.
>>
>> Instead of keeping "infecting" the legacy swiotlb code with CoCo logic,
>> this patch tries to introduce a new cc-swiotlb for Confidential VMs.
>>
>> Confidential VMs usually have reasonable modern devices (virtio devices,
>> NVME, etc.), which can access memory above 4GiB, cc-swiotlb could
>> allocate TLB buffers dynamically on-demand, and this design solves
>> problem 1.
> 
> When you say solving you mean support for growing the size dynamically 
> without pre-allocation?
> 
> The IOMMU is traditionally called in non preemptible regions in drivers, 
> and also allocating memory in IO paths is still not considered fully 
> safe due to potential deadlocks. Both make it difficult to allocate 
> large memory regions dynamically.
> 
> It's not clear how you would solve that?
>
> -Andi

Hi Andi,

Thanks for your question!

I am trying to solve this problem by creating a new kernel thread, "kccd",
that populates the TLB buffer in the background.

Specifically,
1. A new kernel thread is created via "arch_initcall". This kthread is
responsible for allocating memory and setting its attributes
(private or shared);
2. The "swiotlb_tbl_map_single" routine only uses the spin_lock-protected
TLB buffers pre-allocated by the kthread;
   a) although this path still includes ONE memory allocation, caused by
the xarray insertion "__xa_insert".
3. After each allocation, the level of the remaining TLB resources is
checked. If it has fallen below the preset threshold (half of the
watermark), the kthread is woken up to refill them.
4. The kthread allocates TLB buffers in batches of
"(MAX_ORDER_NR_PAGES << PAGE_SHIFT)" bytes, to reduce both the spin_lock
hold time and the number of calls to set_memory_decrypted().
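
To make the four steps above more concrete, here is a minimal user-space
sketch of the watermark/refill idea. It is NOT the code from the patch:
"struct cc_pool", "pool_take", "pool_refill", SLOT_SIZE, BATCH_SLOTS and
POOL_CAP are hypothetical names and values chosen for illustration, malloc()
stands in for page allocation plus set_memory_decrypted(), a C11 atomic_flag
stands in for the spin_lock, and a plain flag stands in for waking the
kthread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOT_SIZE   4096u /* hypothetical size of one bounce slot */
#define BATCH_SLOTS 8u    /* hypothetical refill batch, in slots */
#define POOL_CAP    64u   /* capacity of the free-slot stack */

struct cc_pool {
	atomic_flag lock;          /* stand-in for the kernel spin_lock */
	void *free_slots[POOL_CAP];
	size_t nr_free;            /* slots currently available */
	size_t watermark;          /* target number of free slots */
	bool refill_requested;     /* stand-in for waking the kthread */
};

static inline void pool_lock(struct cc_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		; /* spin */
}

static inline void pool_unlock(struct cc_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

void pool_init(struct cc_pool *p, size_t watermark)
{
	atomic_flag_clear(&p->lock);
	p->nr_free = 0;
	p->watermark = watermark < POOL_CAP ? watermark : POOL_CAP;
	p->refill_requested = false;
}

/*
 * Map path (steps 2 and 3): only hands out pre-filled slots and never
 * allocates. After the attempt it checks the level and requests a refill
 * when fewer than half the watermark remain.
 */
void *pool_take(struct cc_pool *p)
{
	void *slot = NULL;

	pool_lock(p);
	if (p->nr_free > 0)
		slot = p->free_slots[--p->nr_free];
	if (p->nr_free < p->watermark / 2)
		p->refill_requested = true;
	pool_unlock(p);
	return slot;
}

/*
 * One pass of the background thread (steps 1 and 4): allocate a batch
 * outside the lock, then publish it under the lock, until the pool is
 * back at the watermark. Returns the number of slots added.
 */
size_t pool_refill(struct cc_pool *p)
{
	size_t added = 0;

	for (;;) {
		void *batch[BATCH_SLOTS];
		size_t want, got = 0, i;

		pool_lock(p);
		want = p->nr_free < p->watermark ? p->watermark - p->nr_free : 0;
		pool_unlock(p);
		if (want == 0)
			break;
		if (want > BATCH_SLOTS)
			want = BATCH_SLOTS;

		/* Slow work happens here, with the lock dropped. */
		while (got < want) {
			void *s = malloc(SLOT_SIZE);
			if (!s)
				break;
			batch[got++] = s;
		}
		if (got == 0)
			break;

		pool_lock(p);
		for (i = 0; i < got; i++) {
			if (p->nr_free < POOL_CAP)
				p->free_slots[p->nr_free++] = batch[i];
			else
				free(batch[i]); /* pool unexpectedly full */
		}
		pool_unlock(p);
		added += got;
		if (got < want)
			break; /* allocation failed part-way; retry later */
	}

	pool_lock(p);
	p->refill_requested = false;
	pool_unlock(p);
	return added;
}
```

In this model the caller of pool_take() only ever spins briefly on the lock,
while the expensive work is confined to pool_refill(), which in the design
described above would run in the kthread context where sleeping is allowed.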

Thanks,
Guorui


Thread overview: 19+ messages
2023-01-28  8:32 [RFC] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs GuoRui.Yu
2023-01-28  8:32 ` [PATCH 1/4] swiotlb: Split common code from swiotlb.{c,h} GuoRui.Yu
2023-01-28  8:32 ` [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs GuoRui.Yu
2023-01-28 12:03   ` kernel test robot
2023-01-28 16:41   ` Randy Dunlap
2023-01-29  1:54     ` Guorui Yu
2023-01-29 16:58   ` Andi Kleen
2023-01-30  2:25     ` Guorui Yu [this message]
2023-01-30  6:46       ` Andi Kleen
2023-01-30 13:45         ` Guorui Yu
2023-01-31 17:16           ` Andi Kleen
2023-02-01  2:08             ` Guorui Yu
2023-01-28  8:32 ` [PATCH 3/4] swiotlb: Add tracepoint swiotlb_unbounced GuoRui.Yu
2023-01-28  8:32 ` [PATCH 4/4] cc-swiotlb: Allow set swiotlb watermark from cmdline GuoRui.Yu
2023-01-28 20:19   ` kernel test robot
2023-01-28  9:03 ` [RFC] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs Guorui Yu
2023-01-30  6:54   ` Christoph Hellwig
2023-01-30 13:03 ` Robin Murphy
2023-01-30 14:37   ` Guorui Yu
