From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v1 5/7] mm: introduce page_offline_(begin|end|freeze|unfreeze) to synchronize setting PageOffline()
To: Michal Hocko
Cc: linux-kernel@vger.kernel.org, Andrew Morton, "Michael S. Tsirkin",
 Jason Wang, Alexey Dobriyan, Mike Rapoport, "Matthew Wilcox (Oracle)",
 Oscar Salvador, Roman Gushchin, Alex Shi, Steven Price, Mike Kravetz,
 Aili Yao, Jiri Bohac, "K. Y. Srinivasan", Haiyang Zhang,
 Stephen Hemminger, Wei Liu, Naoya Horiguchi,
 linux-hyperv@vger.kernel.org,
 virtualization@lists.linux-foundation.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
References: <20210429122519.15183-1-david@redhat.com> <20210429122519.15183-6-david@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Message-ID: <8650f764-8652-a82c-c54f-f67401c800e8@redhat.com>
Date: Wed, 5 May 2021 17:10:33 +0200
X-Mailing-List: linux-fsdevel@vger.kernel.org

On 05.05.21 15:24, Michal Hocko wrote:
> On Thu 29-04-21 14:25:17, David Hildenbrand wrote:
>> A driver might set a page logically offline -- PageOffline() -- and
>> turn the page inaccessible in the hypervisor; after that, access to page
>> content can be fatal. One example is virtio-mem; while unplugged memory
>> -- marked as PageOffline() can currently be read in the hypervisor, this
>> will no longer be the case in the future; for example, when having
>> a virtio-mem device backed by huge pages in the hypervisor.
>>
>> Some special PFN walkers -- i.e., /proc/kcore -- read content of random
>> pages after checking PageOffline(); however, these PFN walkers can race
>> with drivers that set PageOffline().
>>
>> Let's introduce page_offline_(begin|end|freeze|unfreeze) for
>> synchronizing.
>>
>> page_offline_freeze()/page_offline_unfreeze() allows for a subsystem to
>> synchronize with such drivers, achieving that a page cannot be set
>> PageOffline() while frozen.
>>
>> page_offline_begin()/page_offline_end() is used by drivers that care about
>> such races when setting a page PageOffline().
>>
>> For simplicity, use a rwsem for now; neither drivers nor users are
>> performance sensitive.
>
> Please add a note to the PageOffline documentation as well. While we are
> adding the api close enough, an explicit note there wouldn't hurt.

Will do.

>
>> Signed-off-by: David Hildenbrand
>
> As to the patch itself, I am slightly worried that other pfn walkers
> might be less tolerant to the locking than the proc ones. On the other
> hand most users shouldn't really care as they do not tend to touch the
> memory content and PageOffline check without any synchronization should
> be sufficient for those. Let's try this out and see where we get...

That matches my thinking. Users that actually read random page content
(as discussed in the cover letter) are:

1. Hibernation
2. Dumping (/proc/kcore, /proc/vmcore)
3. Physical memory access bypassing the kernel via /dev/mem
4. Live debug tools (kgdb)

Other PFN walkers really shouldn't (and don't) access random page content.

Thanks!

-- 
Thanks,

David / dhildenb