Date: Mon, 18 Oct 2021 15:37:49 -0300
From: Jason Gunthorpe
To: Joao Martins
Cc: Dan Williams, linux-mm@kvack.org, Vishal Verma, Dave Jiang,
	Naoya Horiguchi, Matthew Wilcox, John Hubbard, Jane Chu,
	Muchun Song, Mike Kravetz, Andrew Morton, Jonathan Corbet,
	Christoph Hellwig, nvdimm@lists.linux.dev, linux-doc@vger.kernel.org
Subject: Re: [PATCH v4 08/14] mm/gup: grab head page refcount once for group of subpages
Message-ID: <20211018183749.GE3686969@ziepe.ca>
References: <20210827145819.16471-1-joao.m.martins@oracle.com>
	<20210827145819.16471-9-joao.m.martins@oracle.com>
	<20210827162552.GK1200268@ziepe.ca>
	<20210830130741.GO1200268@ziepe.ca>
	<20210831170526.GP1200268@ziepe.ca>
	<8c23586a-eb3b-11a6-e72a-dcc3faad4e96@oracle.com>
	<20210928180150.GI3544071@ziepe.ca>
	<3f35cc33-7012-5230-a771-432275e6a21e@oracle.com>
In-Reply-To: <3f35cc33-7012-5230-a771-432275e6a21e@oracle.com>

On Wed, Sep 29, 2021 at 12:50:15PM +0100, Joao Martins wrote:
> On 9/28/21 19:01, Jason Gunthorpe wrote:
> > On Thu, Sep 23, 2021 at 05:51:04PM +0100,
> > Joao Martins wrote:
> >> So ... if pgmap accounting was removed from gup-fast then this patch
> >> would be a lot simpler and we could perhaps just fallback to the regular
> >> hugepage case (THP, HugeTLB) like your suggestion at the top. See at the
> >> end below scissors mark as the ballpark of changes.
> >>
> >> So far my options seem to be: 1) this patch which leverages the existing
> >> iteration logic or 2) switching to for_each_compound_range() -- see my previous
> >> reply 3) waiting for Dan to remove @pgmap accounting in gup-fast and use
> >> something similar to below scissors mark.
> >>
> >> What do you think would be the best course of action?
> >
> > I still think the basic algorithm should be to accumulate physically
> > contiguous addresses when walking the page table and then flush them
> > back to struct pages once we can't accumulate any more.
> >
> > That works for both the walkers and all the page types?
> >
>
> The logic already handles all page types -- I was trying to avoid the extra
> complexity in regular hugetlb/THP path by not merging the handling of the
> oddball case that is devmap (or fundamentally devmap
> non-compound case in the future).

FYI, this untested thing is what I came to when I tried to make
something like this:

/*
 * A large page entry such as PUD/PMD can point to a struct page. In cases like
 * THP this struct page will be a compound page of the same order as the page
 * table level. However, in cases like DAX or more generally pgmap ZONE_DEVICE,
 * the PUD/PMD may point at the first pfn in a string of pages.
 *
 * This helper iterates over all head pages or all the non-compound base pages.
 */
struct iter_state {
	struct page *head;
	unsigned long compound_nr;
	unsigned long pfn;
	unsigned long end_pfn;
};

static inline struct page *__pt_start_iter(struct iter_state *state,
					   struct page *page, unsigned long pfn,
					   unsigned int entry_size)
{
	state->head = compound_head(page);
	state->compound_nr = compound_nr(state->head);
	/* Round down to the first pfn of the compound page containing pfn */
	state->pfn = pfn & (~(state->compound_nr - 1));
	state->end_pfn = pfn + entry_size / PAGE_SIZE;
	return state->head;
}

static inline struct page *__pt_next_page(struct iter_state *state)
{
	/* Skip the whole compound page (or one base page) just visited */
	state->pfn += state->compound_nr;
	if (state->end_pfn <= state->pfn)
		return NULL;
	state->head = pfn_to_page(state->pfn);
	state->compound_nr = compound_nr(state->head);
	return state->head;
}

/* state is a pointer to a struct iter_state */
#define for_each_page_in_pt_entry(state, page, pfn, entry_size)                \
	for (page = __pt_start_iter(state, page, pfn, entry_size); page;       \
	     page = __pt_next_page(state))

static void remove_pages_from_page_table(struct vm_area_struct *vma,
					 struct page *page, unsigned long pfn,
					 unsigned int entry_size, bool is_dirty,
					 bool is_young)
{
	struct iter_state state;

	for_each_page_in_pt_entry(&state, page, pfn, entry_size)
		remove_page_from_page_table(vma, page, is_dirty, is_young);
}

Jason
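
For readers who want to poke at the iteration outside the kernel, here is a
minimal user-space sketch of the same idea. It is only an illustration under
stated assumptions: struct page, mem_map, pfn_to_page(), page_to_pfn(),
compound_head(), compound_nr(), make_compound() and visit_entry() below are
hypothetical stand-ins written for this sketch, not the kernel's definitions,
and PAGE_SIZE is assumed to be 4096. The iterator itself follows the helper
above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumption for this sketch */
#define NR_PAGES 16UL

/* Hypothetical stand-in for struct page, not the kernel layout. */
struct page {
	unsigned long head_pfn;	/* pfn of the head page (own pfn if base/head) */
	unsigned long nr;	/* pages in the compound page, valid on the head */
};

static struct page mem_map[NR_PAGES];

static struct page *pfn_to_page(unsigned long pfn) { return &mem_map[pfn]; }
static unsigned long page_to_pfn(const struct page *p)
{
	return (unsigned long)(p - mem_map);
}
static struct page *compound_head(struct page *p)
{
	return &mem_map[p->head_pfn];
}
static unsigned long compound_nr(struct page *p)
{
	return compound_head(p)->nr;
}

/* Mark pfn..pfn+nr-1 as one compound page (nr == 1 gives a base page). */
static void make_compound(unsigned long pfn, unsigned long nr)
{
	assert(pfn + nr <= NR_PAGES);
	for (unsigned long i = 0; i < nr; i++) {
		mem_map[pfn + i].head_pfn = pfn;
		mem_map[pfn + i].nr = (i == 0) ? nr : 0;
	}
}

struct iter_state {
	struct page *head;
	unsigned long compound_nr;
	unsigned long pfn;
	unsigned long end_pfn;
};

static struct page *pt_start_iter(struct iter_state *s, struct page *page,
				  unsigned long pfn, unsigned long entry_size)
{
	s->head = compound_head(page);
	s->compound_nr = compound_nr(s->head);
	s->pfn = pfn & ~(s->compound_nr - 1);
	s->end_pfn = pfn + entry_size / PAGE_SIZE;
	return s->head;
}

static struct page *pt_next_page(struct iter_state *s)
{
	s->pfn += s->compound_nr;
	if (s->end_pfn <= s->pfn)
		return NULL;
	s->head = pfn_to_page(s->pfn);
	s->compound_nr = compound_nr(s->head);
	return s->head;
}

#define for_each_page_in_pt_entry(state, page, pfn, entry_size)        \
	for (page = pt_start_iter(state, page, pfn, entry_size); page; \
	     page = pt_next_page(state))

/* Record the pfn of every page the iterator yields for one entry. */
static size_t visit_entry(unsigned long pfn, unsigned long entry_size,
			  unsigned long *visited, size_t max)
{
	struct iter_state state;
	struct page *page = pfn_to_page(pfn);
	size_t n = 0;

	for_each_page_in_pt_entry(&state, page, pfn, entry_size)
		if (n < max)
			visited[n++] = page_to_pfn(page);
	return n;
}
```

In this model, an entry backed by a single compound page of the same size
(the THP-like case) yields one head page, an entry backed by base pages (the
non-compound devmap case) yields every page, and an entry backed by several
smaller compound pages yields one head per compound page. Calling
make_compound() to lay out mem_map and then visit_entry() with the entry's
first pfn and size is enough to see each case.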