From: Alistair Popple <apopple@nvidia.com>
To: Jan Kara <jack@suse.cz>
Cc: dan.j.williams@intel.com, vishal.l.verma@intel.com,
dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com,
jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org,
mpe@ellerman.id.au, npiggin@gmail.com,
dave.hansen@linux.intel.com, ira.weiny@intel.com,
willy@infradead.org, djwong@kernel.org, tytso@mit.edu,
linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev,
linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-ext4@vger.kernel.org,
linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de,
david@fromorbit.com
Subject: Re: [PATCH 06/13] mm/memory: Add dax_insert_pfn
Date: Fri, 06 Sep 2024 16:21:53 +1000
Message-ID: <87seudb8nm.fsf@nvdebian.thelocal>
In-Reply-To: <20240627113328.ozqkzhloufrpsdcr@quack3>

Jan Kara <jack@suse.cz> writes:

> On Thu 27-06-24 10:54:21, Alistair Popple wrote:
>> Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This
>> creates a special devmap PTE entry for the pfn but does not take a
>> reference on the underlying struct page for the mapping. This is
>> because DAX page refcounts are treated specially, as indicated by the
>> presence of a devmap entry.
>>
>> To allow DAX page refcounts to be managed the same as normal page
>> refcounts introduce dax_insert_pfn. This will take a reference on the
>> underlying page much the same as vmf_insert_page, except it also
>> permits upgrading an existing mapping to be writable if
>> requested/possible.
>>
>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>
> Overall this looks good to me. Some comments below.
>
>> ---
>> include/linux/mm.h | 4 ++-
>> mm/memory.c | 79 ++++++++++++++++++++++++++++++++++++++++++-----
>> 2 files changed, 76 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 9a5652c..b84368b 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1080,6 +1080,8 @@ int vma_is_stack_for_current(struct vm_area_struct *vma);
>> struct mmu_gather;
>> struct inode;
>>
>> +extern void prep_compound_page(struct page *page, unsigned int order);
>> +
>
> You don't seem to use this function in this patch?

Thanks, that was a bad rebase while splitting this up. It belongs later
in the series.

>> diff --git a/mm/memory.c b/mm/memory.c
>> index ce48a05..4f26a1f 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -1989,14 +1989,42 @@ static int validate_page_before_insert(struct page *page)
>> }
>>
>> static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
>> - unsigned long addr, struct page *page, pgprot_t prot)
>> + unsigned long addr, struct page *page, pgprot_t prot, bool mkwrite)
>> {
>> struct folio *folio = page_folio(page);
>> + pte_t entry = ptep_get(pte);
>>
>> - if (!pte_none(ptep_get(pte)))
>> + if (!pte_none(entry)) {
>> + if (mkwrite) {
>> + /*
>> + * For read faults on private mappings the PFN passed
>> + * in may not match the PFN we have mapped if the
>> + * mapped PFN is a writeable COW page. In the mkwrite
>> + * case we are creating a writable PTE for a shared
>> + * mapping and we expect the PFNs to match. If they
>> + * don't match, we are likely racing with block
>> + * allocation and mapping invalidation so just skip the
>> + * update.
>> + */
>> + if (pte_pfn(entry) != page_to_pfn(page)) {
>> + WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
>> + return -EFAULT;
>> + }
>> + entry = maybe_mkwrite(entry, vma);
>> + entry = pte_mkyoung(entry);
>> + if (ptep_set_access_flags(vma, addr, pte, entry, 1))
>> + update_mmu_cache(vma, addr, pte);
>> + return 0;
>> + }
>> return -EBUSY;
>
> If you do this like:
>
> if (!mkwrite)
> return -EBUSY;
>
> You can reduce indentation of the big block and also making the flow more
> obvious...

Good idea.

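A rough sketch of how the restructured check could read, written as a
userspace model rather than kernel code. Everything named model_* is a
hypothetical stand-in (the struct replaces pte_t, the vma_write flag
replaces the VM_WRITE test inside maybe_mkwrite()), so treat it only as
an illustration of the control flow with the early return:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PTE: only the bits needed to show the flow. */
struct model_pte {
	bool present;		/* false models pte_none() */
	unsigned long pfn;
	bool write;
	bool young;
};

/*
 * Models insert_page_into_pte_locked() with the early return suggested
 * in review. Returns 0 on success or a negative errno.
 */
int model_insert_locked(struct model_pte *pte, unsigned long pfn,
			bool vma_write, bool mkwrite)
{
	if (pte->present) {
		/* Early return keeps the upgrade path at one indent level. */
		if (!mkwrite)
			return -EBUSY;

		/* Upgrading an existing mapping: the PFNs must match. */
		if (pte->pfn != pfn)
			return -EFAULT;

		if (vma_write)		/* models maybe_mkwrite() */
			pte->write = true;
		pte->young = true;	/* models pte_mkyoung() */
		return 0;
	}

	/* Empty slot: insert a fresh entry. */
	pte->present = true;
	pte->pfn = pfn;
	pte->write = mkwrite && vma_write;
	pte->young = false;
	return 0;
}
```

The behaviour is meant to match the hunk quoted above; only the nesting
changes.
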
>> + }
>> +
>> /* Ok, finally just insert the thing.. */
>> folio_get(folio);
>> + if (mkwrite)
>> + entry = maybe_mkwrite(mk_pte(page, prot), vma);
>> + else
>> + entry = mk_pte(page, prot);
>
> I'd prefer:
>
> entry = mk_pte(page, prot);
> if (mkwrite)
> entry = maybe_mkwrite(entry, vma);
>
> but I don't insist. Also insert_pfn() additionally has pte_mkyoung() and
> pte_mkdirty(). Why was it left out here?

An oversight on my part, thanks for pointing it out!

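For illustration, the entry construction with the preferred ordering and
the young/dirty bits added might look roughly like the userspace model
below. The MODEL_PTE_* flags and model_* helpers are hypothetical
stand-ins for the real PTE helpers, and setting young/dirty only in the
mkwrite case is an assumption here, not something settled in this thread:

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for real PTE bits. */
#define MODEL_PTE_WRITE	(1u << 0)
#define MODEL_PTE_YOUNG	(1u << 1)
#define MODEL_PTE_DIRTY	(1u << 2)

/* Models maybe_mkwrite(): only set the write bit if the VMA allows it. */
unsigned int model_maybe_mkwrite(unsigned int entry, bool vma_write)
{
	return vma_write ? (entry | MODEL_PTE_WRITE) : entry;
}

/* Build the entry once, then adjust it for the mkwrite case. */
unsigned int model_make_entry(unsigned int prot_bits, bool vma_write,
			      bool mkwrite)
{
	unsigned int entry = prot_bits;	/* models mk_pte(page, prot) */

	if (mkwrite) {
		entry |= MODEL_PTE_YOUNG;	/* models pte_mkyoung() */
		entry |= MODEL_PTE_DIRTY;	/* models pte_mkdirty() */
		entry = model_maybe_mkwrite(entry, vma_write);
	}
	return entry;
}
```
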
> Honza