From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31894C4332F for ; Wed, 16 Nov 2022 22:18:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AFDD88E0001; Wed, 16 Nov 2022 17:18:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id AAD586B007B; Wed, 16 Nov 2022 17:18:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 975548E0001; Wed, 16 Nov 2022 17:18:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 8754C6B0078 for ; Wed, 16 Nov 2022 17:18:09 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 5FBDEA0C29 for ; Wed, 16 Nov 2022 22:18:09 +0000 (UTC) X-FDA: 80140719498.07.D31F073 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf27.hostedemail.com (Postfix) with ESMTP id 7072540011 for ; Wed, 16 Nov 2022 22:18:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668637086; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=eLacVH9A13gWTXV0mKgk98a/BsqmzWChzF1U5/Q2nY4=; b=hZDn7GpGjgzDvHL6hEQZ8aLKhT5r03vfI97Qbaqst9bt5SCUYXCMtxmX9Ujmp3jGuTGlHM 3GPDWWLKB/Tvo9d4wWdZ2tPoFOzFgMq2mQF5bv3wxacvf2CNnd06J4FyyMQl+OfZN+r1PP eim8dOCZCKroMabeDDbT8fHDppO6ZIQ= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-542-8OJg3IwsOGG9B1ApauEXFA-1; Wed, 16 Nov 2022 17:18:05 -0500 X-MC-Unique: 8OJg3IwsOGG9B1ApauEXFA-1 Received: by mail-qt1-f200.google.com with SMTP id g3-20020ac84b63000000b003a529c62a92so1246qts.23 for ; Wed, 16 Nov 2022 14:18:05 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=eLacVH9A13gWTXV0mKgk98a/BsqmzWChzF1U5/Q2nY4=; b=PhVHQoRcRr/Q8kGaXCcEUaHB/SxhRKXFnUKcWGmy3TCuKUijh+/5mePeV31VqaHAtH Zk7QDNE2qF40OpMXlWnu8dvVEZkKubvqAYjfJwwj/jWizWdvOv2XEykraAi04NWxssXg S1llRnSVukxPeaKAseKngCvRiKG5ak23CskB8p8QOZ9ufieVTZmqvNWLag+3MIpsIfwN oe03yMxyRMbRtb7zhrLlt1O3wLAfy3vSVcPtIYi4yLVvIH428PufG2+VtEd+x1gDDmw2 SBg3MRsZejJ6oe3QApnBCz86n6kYfJdZ5AkVm856Q5tLRby+f9MRyuZ1G5XTduyBr5yi EfLw== X-Gm-Message-State: ANoB5pm/xAz9CcMRFN9OiZbL9Ts7duSAVt07RV4/qtVhA/MWmVEqT7S8 NCeQ+cLxuxHj3CdCyi5MB8zMVfF4GfFeTiwWcEgnj8I9XXq6rUddcVu8eEy2G5+W5S9upI4pd60 NQur7n3eivbI= X-Received: by 2002:ac8:44ab:0:b0:3a5:4859:8176 with SMTP id a11-20020ac844ab000000b003a548598176mr23466897qto.478.1668637085204; Wed, 16 Nov 2022 14:18:05 -0800 (PST) X-Google-Smtp-Source: AA0mqf5IVIYGjvz+hOM8t/mLeUfxGEfTJ1qWUuypNOzMzWaqU2N3kMAKsrFEcgvZj34OH1189VBcaQ== X-Received: by 2002:ac8:44ab:0:b0:3a5:4859:8176 with SMTP id a11-20020ac844ab000000b003a548598176mr23466874qto.478.1668637084937; Wed, 16 Nov 2022 14:18:04 -0800 (PST) Received: from x1n (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id h21-20020ac846d5000000b003a4f22c6507sm9441673qto.48.2022.11.16.14.18.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 16 Nov 2022 14:18:00 -0800 (PST) Date: Wed, 16 Nov 2022 17:17:59 -0500 From: Peter Xu To: James Houghton Cc: Mike Kravetz , Muchun Song , David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , Zach O'Keefe , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH v2 10/47] hugetlb: add hugetlb_pte to track HugeTLB page table entries Message-ID: References: <20221021163703.3218176-1-jthoughton@google.com> <20221021163703.3218176-11-jthoughton@google.com> MIME-Version: 1.0 In-Reply-To: <20221021163703.3218176-11-jthoughton@google.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1668637089; a=rsa-sha256; cv=none; b=44QoSy7+MJVusrr4VKUYda6tUBV5oI+DiNGyHRy0+m8Qw5a7D3zer88JTU6TbdYuc3QrXG MFkt9DvReP7k7hRSD0neZZXf1MvZSmeiyT6k4ZwkHzpWyuf3A6v1jkd6niOx0UNHXE6JHQ EZd5/i3JjVjmqOOudijzdNOtAZl69TI= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=hZDn7GpG; spf=pass (imf27.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1668637089; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=eLacVH9A13gWTXV0mKgk98a/BsqmzWChzF1U5/Q2nY4=; b=Ni16R+Te9V41/XCCLSPaZo14cPKfknVbkyvHt9hjGjEW1hkcPoOJzFAVHlfFyxhieEwgCd B6yEiupRrOMiBJXo4Lblkp0M2CfD3I8ca1rOuz4T+emenVMXIqw7z4hm51UlKVHB8ZlvQu 1Fr0TZrmS/3mlRZv0T87wHkPO0YZ6P8= X-Stat-Signature: 5oext8b5fkigydhes6uhwfpkiusjx7oc X-Rspamd-Queue-Id: 7072540011 Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=hZDn7GpG; spf=pass (imf27.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1668637087-302221 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Oct 21, 2022 at 04:36:26PM +0000, James Houghton wrote: > +struct hugetlb_pte { > + pte_t *ptep; > + unsigned int shift; > + enum hugetlb_level level; > + spinlock_t *ptl; > +}; Do we need both shift + level? Maybe it's only meaningful for ARM where the shift may not be directly calculcated from level? I'm wondering whether we can just maintain "shift" then we calculate "level" realtime. It just reads a bit weird to have these two fields, also a burden to most of the call sites where shift and level exactly match.. > + > +static inline > +void hugetlb_pte_populate(struct hugetlb_pte *hpte, pte_t *ptep, > + unsigned int shift, enum hugetlb_level level) I'd think it's nicer to replace "populate" with something else, as populate is definitely a meaningful word in vm world for "making something appear if it wasn't". Maybe hugetlb_pte_setup()? Even one step back, on the naming of hugetlb_pte.. Sorry to comment on namings especially on this one, I really don't like to do that normally.. but here hugetlb_pte only walks the sub-page level of pgtables, meanwhile it's not really a pte but an iterator. How about hugetlb_hgm_iter? "hgm" tells that it only walks sub-level, and "iter" tells that it is an iterator, being updated for each stepping downwards. Then hugetlb_pte_populate() can be hugetlb_hgm_iter_init(). Take these comments with a grain of salt, and it never hurts to wait for a 2nd opinion before anything. > +{ > + WARN_ON_ONCE(!ptep); > + hpte->ptep = ptep; > + hpte->shift = shift; > + hpte->level = level; > + hpte->ptl = NULL; > +} > + > +static inline > +unsigned long hugetlb_pte_size(const struct hugetlb_pte *hpte) > +{ > + WARN_ON_ONCE(!hpte->ptep); > + return 1UL << hpte->shift; > +} > + > +static inline > +unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte) > +{ > + WARN_ON_ONCE(!hpte->ptep); > + return ~(hugetlb_pte_size(hpte) - 1); > +} > + > +static inline > +unsigned int hugetlb_pte_shift(const struct hugetlb_pte *hpte) > +{ > + WARN_ON_ONCE(!hpte->ptep); > + return hpte->shift; > +} > + > +static inline > +enum hugetlb_level hugetlb_pte_level(const struct hugetlb_pte *hpte) > +{ > + WARN_ON_ONCE(!hpte->ptep); There're definitely a bunch of hpte->ptep WARN_ON_ONCE()s.. AFAIK the hugetlb_pte* will be setup once with valid ptep and then it should always be. I rem someone commented on these helpers look not useful, which I must confess I had the same feeling. But besides that, I'd rather drop all these WARN_ON_ONCE()s but only check it when init() the iterator/pte. > + return hpte->level; > +} > + > +static inline > +void hugetlb_pte_copy(struct hugetlb_pte *dest, const struct hugetlb_pte *src) > +{ > + dest->ptep = src->ptep; > + dest->shift = src->shift; > + dest->level = src->level; > + dest->ptl = src->ptl; > +} > + > +bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte); > + > struct hugepage_subpool { > spinlock_t lock; > long count; > @@ -1210,6 +1279,25 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h, > return ptl; > } > > +static inline > +spinlock_t *hugetlb_pte_lockptr(struct mm_struct *mm, struct hugetlb_pte *hpte) > +{ > + > + BUG_ON(!hpte->ptep); Another BUG_ON(); better be dropped too. > + if (hpte->ptl) > + return hpte->ptl; > + return huge_pte_lockptr(hugetlb_pte_shift(hpte), mm, hpte->ptep); I'm curious whether we can always have hpte->ptl set for a valid hugetlb_pte. I think that means we'll need to also init the ptl in the init() fn of the iterator. Then it'll be clear on which lock to take for each valid hugetlb_pte. > +} -- Peter Xu