Date: Wed, 17 Nov 2021 10:43:08 +0100
From: Christoph Hellwig <hch@lst.de>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: linux-mm@kvack.org, Dan Williams, Vishal Verma, Dave Jiang,
	Naoya Horiguchi, Matthew Wilcox, Jason Gunthorpe, John Hubbard,
	Jane Chu, Muchun Song, Mike Kravetz, Andrew Morton,
	Jonathan Corbet, Christoph Hellwig, nvdimm@lists.linux.dev,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v5 8/8] device-dax: compound devmap support
Message-ID: <20211117094308.GC8429@lst.de>
References: <20211112150824.11028-1-joao.m.martins@oracle.com>
	<20211112150824.11028-9-joao.m.martins@oracle.com>
In-Reply-To: <20211112150824.11028-9-joao.m.martins@oracle.com>

On Fri, Nov 12, 2021 at 04:08:24PM +0100, Joao Martins wrote:
> Use the newly added compound devmap facility which maps the assigned dax
> ranges as compound pages at a page size of @align.
>
> dax devices are created with a fixed @align (huge page size), which is
> also enforced at mmap() of the device. Faults consequently happen at the
> @align specified at creation time too, and that does not change
> throughout the dax device's lifetime.
> MCEs unmap a whole dax huge page, and splits likewise occur at the
> configured page size.
>
> Performance measured by gup_test improves considerably for
> unpin_user_pages() and altmap with NVDIMMs:
>
> $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S -a -n 512 -w
> (pin_user_pages_fast 2M pages) put:~71 ms -> put:~22 ms
> [altmap]
> (pin_user_pages_fast 2M pages) get:~524ms put:~525 ms -> get: ~127ms put:~71ms
>
> $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S -a -n 512 -w
> (pin_user_pages_fast 2M pages) put:~513 ms -> put:~188 ms
> [altmap with -m 127004]
> (pin_user_pages_fast 2M pages) get:~4.1 secs put:~4.12 secs -> get:~1sec put:~563ms
>
> ... as well as unpin_user_page_range_dirty_lock() being just as effective
> as THP/hugetlb[0] pages.
>
> [0] https://lore.kernel.org/linux-mm/20210212130843.13865-5-joao.m.martins@oracle.com/
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/dax/device.c | 57 ++++++++++++++++++++++++++++++++++----------
>  1 file changed, 44 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> index a65c67ab5ee0..0c2ac97d397d 100644
> --- a/drivers/dax/device.c
> +++ b/drivers/dax/device.c
> @@ -192,6 +192,42 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
>  }
>  #endif /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>
> +static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn,
> +			     unsigned long fault_size,
> +			     struct address_space *f_mapping)
> +{
> +	unsigned long i;
> +	pgoff_t pgoff;
> +
> +	pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size));
> +
> +	for (i = 0; i < fault_size / PAGE_SIZE; i++) {
> +		struct page *page;
> +
> +		page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
> +		if (page->mapping)
> +			continue;
> +		page->mapping = f_mapping;
> +		page->index = pgoff + i;
> +	}
> +}

No need to pass f_mapping here, it must be vmf->vma->vm_file->f_mapping.

> +static void set_compound_mapping(struct vm_fault *vmf, pfn_t pfn,
> +				 unsigned long fault_size,
> +				 struct address_space *f_mapping)
> +{
> +	struct page *head;
> +
> +	head = pfn_to_page(pfn_t_to_pfn(pfn));
> +	head = compound_head(head);
> +	if (head->mapping)
> +		return;
> +
> +	head->mapping = f_mapping;
> +	head->index = linear_page_index(vmf->vma,
> +			ALIGN(vmf->address, fault_size));
> +}

Same here.

>  	if (rc == VM_FAULT_NOPAGE) {
> -		unsigned long i;
> -		pgoff_t pgoff;
> +		struct dev_pagemap *pgmap = dev_dax->pgmap;

If you're touching this anyway: why not do an early return here for the
error case to simplify the flow?
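
For the f_mapping comments above, a completely untested sketch of what
that could look like (the body is unchanged from the patch; the argument
just goes away and the mapping is derived from the fault itself, and
set_compound_mapping() would get the same treatment):

static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn,
			     unsigned long fault_size)
{
	/* a dev_dax fault always comes in through the device file */
	struct address_space *f_mapping = vmf->vma->vm_file->f_mapping;
	unsigned long i;
	pgoff_t pgoff;

	pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size));

	for (i = 0; i < fault_size / PAGE_SIZE; i++) {
		struct page *page;

		page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
		if (page->mapping)
			continue;
		page->mapping = f_mapping;
		page->index = pgoff + i;
	}
}

The early-return point is the same idea: check for rc != VM_FAULT_NOPAGE
first and bail out (after whatever cleanup the function already does),
so the page->mapping setup isn't nested inside the conditional.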