To: "Kirill A. Shutemov", Joerg Roedel
Cc: David Rientjes, Borislav Petkov, Andy Lutomirski, Sean Christopherson,
 Andrew Morton, Vlastimil Babka, "Kirill A. Shutemov", Andi Kleen,
 Brijesh Singh, Tom Lendacky, Jon Grimm, Thomas Gleixner, Peter Zijlstra,
 Paolo Bonzini, Ingo Molnar, "Kaplan, David", Varad Gautam, Dario Faggioli,
 x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev
References: <20210720173004.ucrliup5o7l3jfq3@box.shutemov.name>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: Runtime Memory Validation in Intel-TDX and AMD-SNP
Date: Thu, 22 Jul 2021 17:46:13 +0200
In-Reply-To: <20210720173004.ucrliup5o7l3jfq3@box.shutemov.name>

>>
>>     8. When memory is returned to the memblock or page allocators,
>>        it is _not_ invalidated. In fact, all memory which is freed
>>        need to be valid. If it was marked invalid in the meantime
>>        (e.g. if it the memory was used for DMA buffers), the code
>>        owning the memory needs to validate it again before freeing
>>        it.
>>
>>        The benefit of doing memory validation at allocation time is
>>        that it keeps the exception handler for invalid memory
>>        simple, because no exceptions of this kind are expected under
>>        normal operation.
>
> During early boot I treat unaccepted memory as a usable RAM. It only
> requires special treatment on memblock_reserve(), which used for early
> memory allocation: unaccepted usable RAM has to be accepted, before
> reserving.
>
> For fine-grained accepting/validation tracking I use PageOffline() flags
> (it's encoded into mapcount): before adding an unaccepted page to free
> list I set the PageOffline() to indicate that the page has to be accepted
> before returning from the page allocator. Currently, we never have
> PageOffline() set for pages on free lists, so we won't have confusion with
> ballooning or memory hotplug.

I was just about to propose something similar. Something like that
sounds like the best approach to me:

1. Sync e820 to memblock
2. Sync memblock to memmap
3. Let the page allocator deal with validation once initializing/handing
   out memory

PageOffline() does exactly what you want, just be aware that
PageBuddy()+PageOffline() won't be recognized by crash anymore, as it
tests for a single memmap value. That can be fixed with makedumpfile
updates once that applies.

Alternatively, you could use any other page flag that is yet unused,
combined with PageBuddy.

Sure, there might be obstacles, but it certainly sounds like a clean
approach to me.

>
> I try to keep pages accepted in 2M or 4M chunks (pageblock_order or
> MAX_ORDER). It is reasonable compromise on speed/latency.
>
> I still debugging the code, but hopefully will get working PoC this week.
>

[...]

>
> I'm not sure a bitmap is needed. I hope we can use E820 for early
> tracking. But let's see if it works.

+1, this smells like an anti-pattern. I'm absolutely not in favor of a
bitmap, we have the sparse memory model for a reason.

Also, I am not convinced that kexec/kdump is actually easy to realize
with the bitmap. Who will forward that bitmap? Where will it reside? Who
says it's not corrupted? Just take a look at how we don't even have
access to the memmap of the old kernel in the new kernel -- and have to
locate and decipher it in constantly-to-be-updated user space
makedumpfile. Any time you'd change anything about the bitmap ("hey,
let's use larger chunks", "hey, let's split it up") you'd break the
old_kernel <-> new_kernel agreement.

-- 
Thanks,

David / dhildenb
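
To make the allocator-side idea above a bit more concrete, here is a toy
user-space model in Python. It is only a sketch, not kernel code:
ToyAllocator, PAGES_PER_CHUNK and all method names are made up for
illustration. The set of "unaccepted" pfns stands in for a per-page marker
such as PageOffline() on free pages, and the chunk size assumes 4K pages
accepted in 2M units.

```python
from dataclasses import dataclass, field

# Assumption: 4K pages accepted in 2M chunks (512 pages per chunk).
PAGES_PER_CHUNK = 512


@dataclass
class ToyAllocator:
    """Toy model: free pages may still be unaccepted; accept on allocation."""
    nr_pages: int
    free: list = field(default_factory=list)
    # Stand-in for a per-page "needs acceptance" marker on free pages.
    unaccepted: set = field(default_factory=set)
    accept_calls: int = 0

    def __post_init__(self):
        # All memory starts out free and unaccepted.
        self.free = list(range(self.nr_pages))
        self.unaccepted = set(self.free)

    def _accept_chunk(self, pfn):
        # Accept the whole chunk containing pfn in one go.
        start = pfn - pfn % PAGES_PER_CHUNK
        end = min(start + PAGES_PER_CHUNK, self.nr_pages)
        self.unaccepted.difference_update(range(start, end))
        self.accept_calls += 1

    def alloc_page(self):
        # Pages are accepted before they leave the allocator.
        pfn = self.free.pop(0)
        if pfn in self.unaccepted:
            self._accept_chunk(pfn)
        return pfn

    def free_page(self, pfn):
        # Freed memory must already be valid; it is not invalidated here.
        assert pfn not in self.unaccepted
        self.free.append(pfn)
```

In this model the first allocation from a chunk pays the acceptance cost
for the whole chunk, and later allocations from the same chunk do not,
which is the speed/latency trade-off mentioned in the quoted text.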