Date: Thu, 27 Jun 2024 07:44:55 +0200
From: Christoph Hellwig
To: Alistair Popple
Cc: dan.j.williams@intel.com, vishal.l.verma@intel.com,
	dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com,
	jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org,
	mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com,
	ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org,
	tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com,
	peterx@redhat.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev,
	linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de,
	david@fromorbit.com
Subject: Re: [PATCH 10/13] fs/dax: Properly refcount fs dax pages
Message-ID: <20240627054455.GF14837@lst.de>

> diff --git a/drivers/dax/device.c b/drivers/dax/device.c
> index eb61598..b7a31ae 100644
> --- a/drivers/dax/device.c
> +++ b/drivers/dax/device.c
> @@ -126,11 +126,11 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax,
>  		return VM_FAULT_SIGBUS;
>  	}
>  
> -	pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP);
> +	pfn = phys_to_pfn_t(phys, 0);
>  
>  	dax_set_mapping(vmf, pfn, fault_size);
>  
> -	return vmf_insert_mixed(vmf->vma, vmf->address, pfn);
> +	return dax_insert_pfn(vmf->vma, vmf->address, pfn, vmf->flags & FAULT_FLAG_WRITE);

Plenty of overly long lines here and later.

Q: should dax_insert_pfn take a vm_fault structure instead of the vma?
Or are there potential use cases that aren't from the fault path?
Similarly, instead of the bool write, passing the fault flags might
actually make things more readable.

Also, at least currently, it seems like there are no modular users
despite the export, or am I missing something?

> +	blk_queue_flag_set(QUEUE_FLAG_DAX, q);

Just as a heads up, setting of these flags has changed a lot in
linux-next.

>  {
> +	/*
> +	 * Make sure we flush any cached data to the page now that it's free.
> +	 */
> +	if (PageDirty(page))
> +		dax_flush(NULL, page_address(page), page_size(page));
> +

Adding the magic dax_dev == NULL case to dax_flush and going through it
vs just calling arch_wb_cache_pmem directly here seems odd.

But I also don't quite understand how it is related to the rest of
the patch anyway.

> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -373,6 +373,8 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
>  	unsigned long start = addr;
>  
>  	ptl = pmd_trans_huge_lock(pmd, vma);
> +	if (vma_is_dax(vma))
> +		ptl = NULL;
>  	if (ptl) {

This feels sufficiently magic to warrant a comment.

>  		if (!pmd_present(*pmd))
>  			goto out;
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index b7e1599..f11ee0d 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1016,7 +1016,8 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
>  	 */
>  	if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
>  	    pgmap->type == MEMORY_DEVICE_COHERENT ||
> -	    pgmap->type == MEMORY_DEVICE_PCI_P2PDMA)
> +	    pgmap->type == MEMORY_DEVICE_PCI_P2PDMA ||
> +	    pgmap->type == MEMORY_DEVICE_FS_DAX)
>  		set_page_count(page, 0);
>  }

So we'll skip this for MEMORY_DEVICE_GENERIC only. Does anyone remember
if that's actively harmful or just not needed? If the latter, it might
be simpler to just set the page count unconditionally here.
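
As a rough illustration of that last suggestion, here is an untested
userspace sketch, not kernel code. The struct and enum definitions are
stand-ins with arbitrary values, set_page_count() is a toy
reimplementation, the function names zone_device_count_conditional(),
zone_device_count_unconditional() and compare_variants() are
hypothetical, and it assumes each page starts out with a count of 1. It
only shows that the two variants differ for MEMORY_DEVICE_GENERIC and
nothing else; whether that difference matters is exactly the open
question above.

```c
#include <assert.h>

/*
 * Stand-in types for illustration only; these are NOT the kernel
 * definitions, and the enum values are arbitrary.
 */
enum memory_type {
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_COHERENT,
	MEMORY_DEVICE_FS_DAX,
	MEMORY_DEVICE_GENERIC,
	MEMORY_DEVICE_PCI_P2PDMA,
};

struct dev_pagemap { enum memory_type type; };
struct page { int refcount; };

/* Toy version of the helper used in the quoted hunk. */
static void set_page_count(struct page *page, int v)
{
	page->refcount = v;
}

/* Mirrors the condition in the quoted hunk: every type except GENERIC. */
static void zone_device_count_conditional(struct page *page,
					  const struct dev_pagemap *pgmap)
{
	if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
	    pgmap->type == MEMORY_DEVICE_COHERENT ||
	    pgmap->type == MEMORY_DEVICE_PCI_P2PDMA ||
	    pgmap->type == MEMORY_DEVICE_FS_DAX)
		set_page_count(page, 0);
}

/* The simpler variant: reset the count regardless of type. */
static void zone_device_count_unconditional(struct page *page)
{
	set_page_count(page, 0);
}

/* Hypothetical self-check: the variants disagree only for GENERIC. */
static void compare_variants(void)
{
	for (int t = MEMORY_DEVICE_PRIVATE; t <= MEMORY_DEVICE_PCI_P2PDMA; t++) {
		struct dev_pagemap pgmap = { .type = (enum memory_type)t };
		struct page a = { .refcount = 1 }, b = { .refcount = 1 };

		zone_device_count_conditional(&a, &pgmap);
		zone_device_count_unconditional(&b);

		assert(b.refcount == 0);
		assert((a.refcount == b.refcount) ==
		       (t != MEMORY_DEVICE_GENERIC));
	}
}
```

Calling compare_variants() from a small main() exercises all five
stand-in types.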