From: David Hildenbrand
Organization: Red Hat
Date: Mon, 7 Feb 2022 11:32:47 +0100
To: Pedro Demarchi Gomes
Cc: SeongJae Park, Steven Rostedt, Ingo Molnar, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, John Hubbard
Subject: Re: [PATCH] mm/damon: Add option to monitor only writes
In-Reply-To: <20220203131237.298090-1-pedrodemargomes@gmail.com>
References: <20220203131237.298090-1-pedrodemargomes@gmail.com>

On 03.02.22 14:12, Pedro Demarchi Gomes wrote:
> When "writes" is written to /sys/kernel/debug/damon/counter_type damon will monitor only writes.
> This patch also adds the actions mergeable and unmergeable to damos schemes. These actions are used by KSM as explained in [1].

[...]

>
> +static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
> +{
> +	struct page *page;
> +
> +	if (!pte_write(pte))
> +		return false;
> +	if (!is_cow_mapping(vma->vm_flags))
> +		return false;
> +	if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
> +		return false;
> +	page = vm_normal_page(vma, addr, pte);
> +	if (!page)
> +		return false;
> +	return page_maybe_dma_pinned(page);
> +}
> +
> +static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
> +		unsigned long addr, pmd_t *pmdp)
> +{
> +	pmd_t old, pmd = *pmdp;
> +
> +	if (pmd_present(pmd)) {
> +		/* See comment in change_huge_pmd() */
> +		old = pmdp_invalidate(vma, addr, pmdp);
> +		if (pmd_dirty(old))
> +			pmd = pmd_mkdirty(pmd);
> +		if (pmd_young(old))
> +			pmd = pmd_mkyoung(pmd);
> +
> +		pmd = pmd_wrprotect(pmd);
> +		pmd = pmd_clear_soft_dirty(pmd);
> +
> +		set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
> +	} else if (is_migration_entry(pmd_to_swp_entry(pmd))) {
> +		pmd = pmd_swp_clear_soft_dirty(pmd);
> +		set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
> +	}
> +}
> +
> +static inline void clear_soft_dirty(struct vm_area_struct *vma,
> +		unsigned long addr, pte_t *pte)
> +{
> +	/*
> +	 * The soft-dirty tracker uses #PF-s to catch writes
> +	 * to pages, so write-protect the pte as well. See the
> +	 * Documentation/admin-guide/mm/soft-dirty.rst for full description
> +	 * of how soft-dirty works.
> +	 */
> +	pte_t ptent = *pte;
> +
> +	if (pte_present(ptent)) {
> +		pte_t old_pte;
> +
> +		if (pte_is_pinned(vma, addr, ptent))
> +			return;
> +		old_pte = ptep_modify_prot_start(vma, addr, pte);
> +		ptent = pte_wrprotect(old_pte);
> +		ptent = pte_clear_soft_dirty(ptent);
> +		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
> +	} else if (is_swap_pte(ptent)) {
> +		ptent = pte_swp_clear_soft_dirty(ptent);
> +		set_pte_at(vma->vm_mm, addr, pte, ptent);
> +	}
> +}

Just like clearrefs, this can race against GUP-fast to detect pinned pages.
And just like clearrefs, we're not handling PMDs properly. And just like
anything that write-protects random anon pages right now, this does not
consider O_DIRECT as is.

Fortunately, there are not too many users of clearrefs/softdirty
tracking out there (my search a while ago returned no open source
users). My assumption is that your feature might see more widespread use.

Adding more random write protection before we have fixed the COW issues [1]
really makes my stomach hurt on a Monday morning.

Please, let's defer any more features that rely on write-protecting
random anon pages until we have ways in place to not corrupt random user
space.

That is:
1) Teaching the COW logic to not copy pages that are pinned -- I'm
   working on that.
2) Converting O_DIRECT to use FOLL_PIN instead of FOLL_GET. John is
   working on that.

So I'm not against this change. I'm against this change at this point in
time.

[1] https://lore.kernel.org/all/3ae33b08-d9ef-f846-56fb-645e3b9b4c66@redhat.com/

-- 
Thanks,

David / dhildenb