Subject: Re: [RFCv2 13/13] KVM: unmap guest memory using poisoned pages
To: Sean Christopherson, "Kirill A. Shutemov"
Cc: "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski, Peter Zijlstra,
 Jim Mattson, David Rientjes, "Edgecombe, Rick P", "Kleen, Andi",
 "Yamahata, Isaku", Erdem Aktas, Steve Rutherford, Peter Gonda,
 x86@kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20210416154106.23721-1-kirill.shutemov@linux.intel.com>
 <20210416154106.23721-14-kirill.shutemov@linux.intel.com>
 <20210419142602.khjbzktk5tk5l6lk@box.shutemov.name>
 <20210419164027.dqiptkebhdt5cfmy@box.shutemov.name>
From: David Hildenbrand
Organization: Red Hat
Date: Mon, 19 Apr 2021 20:12:06 +0200

On 19.04.21 20:09, Sean Christopherson wrote:
> On Mon, Apr 19, 2021, Kirill A. Shutemov wrote:
>> On Mon, Apr 19, 2021 at 04:01:46PM +0000, Sean Christopherson wrote:
>>> But fundamentally the private pages are, well, private. They can't be shared
>>> across processes, so I think we could (should?) require the VMA to always be
>>> MAP_PRIVATE. Does that buy us enough to rely on the VMA alone? I.e. is that
>>> enough to prevent userspace and unaware kernel code from acquiring a reference
>>> to the underlying page?
>>
>> Shared pages should be fine too (you folks wanted tmpfs support).
>
> Is that a conflict though? If the private->shared conversion request is kicked
> out to userspace, then userspace can re-mmap() the files as MAP_SHARED, no?
>
> Allowing MAP_SHARED for guest private memory feels wrong. The data can't be
> shared, and dirty data can't be written back to the file.
>
>> The poisoned pages must be useless outside of the process with the blessed
>> struct kvm. See kvm_pfn_map in the patch.
>
> The big requirement for kernel TDX support is that the pages are useless in the
> host. Regarding the guest, for TDX, the TDX Module guarantees that at most a
> single KVM guest can have access to a page at any given time. I believe the RMP
> provides the same guarantees for SEV-SNP.
>
> SEV/SEV-ES could still end up with corruption if multiple guests map the same
> private page, but that's obviously not the end of the world since it's the status
> quo today. Living with that shortcoming might be a worthy tradeoff if punting
> mutual exclusion between guests to firmware/hardware allows us to simplify the
> kernel implementation.
>
>>>> - Add a new GUP flag to retrieve such pages from the userspace mapping.
>>>>   Used only for private mapping population.
>>>
>>>> - Shared gfn ranges managed by userspace, based on hypercalls from the
>>>>   guest.
>>>>
>>>> - Shared mappings get populated via normal VMA. Any poisoned pages here
>>>>   would lead to SIGBUS.
>>>>
>>>> So far it looks pretty straightforward.
>>>>
>>>> The only thing that I don't understand is at what point the page gets tied
>>>> to the KVM instance. Currently we do it just before populating shadow
>>>> entries, but it would not work with the new scheme: as we poison pages
>>>> on fault, they may never get inserted into shadow entries. That's not
>>>> good as we rely on the info to unpoison the page on free.
>>>
>>> Can you elaborate on what you mean by "unpoison"? If the page is never actually
>>> mapped into the guest, then its poisoned status is nothing more than a software
>>> flag, i.e. nothing extra needs to be done on free.
>>
>> Normally, the poisoned flag is preserved for freed pages, as it usually
>> indicates a hardware issue. In this case we need to return the page to normal
>> circulation. So we need a way to differentiate two kinds of page poison. The
>> current patch does this by adding the page's pfn to kvm_pfn_map. But this will
>> not work if we uncouple poisoning and adding to the shadow PTE.
>
> Why use PG_hwpoison then?
>

I already raised that reusing PG_hwpoison is not what we want. And I
repeat, to me this all looks like a big hack; some things you (Sean)
propose sound cleaner, at least to me.

-- 
Thanks,

David / dhildenb