From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.4 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAB07C10DCE for ; Fri, 13 Mar 2020 22:52:01 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6A0532074D for ; Fri, 13 Mar 2020 22:52:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=ziepe.ca header.i=@ziepe.ca header.b="Q2YVEh2m" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6A0532074D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ziepe.ca Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id ED53E6B0003; Fri, 13 Mar 2020 18:52:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EAFEC6B0006; Fri, 13 Mar 2020 18:52:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D9E6A6B0007; Fri, 13 Mar 2020 18:52:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0075.hostedemail.com [216.40.44.75]) by kanga.kvack.org (Postfix) with ESMTP id C45666B0003 for ; Fri, 13 Mar 2020 18:52:00 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 7B06E824556B for ; Fri, 13 Mar 2020 22:52:00 +0000 (UTC) X-FDA: 76591838400.03.spy72_85defe62e074b X-HE-Tag: spy72_85defe62e074b X-Filterd-Recvd-Size: 7644 Received: from mail-qv1-f67.google.com (mail-qv1-f67.google.com [209.85.219.67]) by imf46.hostedemail.com (Postfix) with ESMTP for ; Fri, 13 Mar 2020 22:51:59 +0000 (UTC) Received: by mail-qv1-f67.google.com with SMTP id v38so1544837qvf.6 for ; Fri, 13 Mar 2020 15:51:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ziepe.ca; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=pnEWT5exzE8VNu4uP9aWxirCjTGDlMeZrBFn4okz7EY=; b=Q2YVEh2mTyNrujOq2W42VrodkM3tb5wDoyO3GCTU9cStXcr06aCbXttMPTosmjn2uF K09UEgvt4RldFZJxIPJIFOYgCpPc4UJ9Ft2myme5EtjlQSfX5PeXhDpuEmeOQCXVX809 uRG5r/blZpuWxkNimYtvArbCVFAgPqF7oxLaifB+l1KBSCZ0iOQLyTNVsJZuQntwNkeT svCjJndRKtajjPu8wfb7nDRTUGRLJKeUawhNZQbZ1uR6C2t1hAGqZ7QAJ2XPcdMNO96L p6fVD9Y9FcQvhr4tiFJh8PvW5HEzM7O6/LEbVxsaW5N+swJVENseaveM2mFQT9e4YQXo IY4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=pnEWT5exzE8VNu4uP9aWxirCjTGDlMeZrBFn4okz7EY=; b=KyJT192Fk2IX9SDgKNtABe85F37lheMMMAwSGXkDnO8cdoQDm7A2k8oZa515Xp+Vls 1ZG5YjiaLtxQova/0Zq7TZX0asXp4dr08UATpW9L6S6+o7X0bD0Hp4fKXO8EYdNpJgqw R4sK5SQc6TGu58imrIgvGSDaPSN5NCUaW357s7A/eGa5ZK/pPug7Rp/5BUJsk4hu8jLl EwomWlJV09EXLM2Ry90vqeZJno/QcIFJUm17lQZ++5IFIjhZKVYPzfY7fYk6CHZHu7nH x2jiO/p3PvrajTzy1xlFGghQO66IEaWeQ201vhCXQgb2zpXUUHAq3Gz19vIwgEKz4dX8 SLPQ== X-Gm-Message-State: ANhLgQ2aMv5L7/gHfeuT4yPTlpABQM6zonapENWZYvGc2YGlhhShe+5I yp8AlelHYfOPesAPBiwAeNv6rQ== X-Google-Smtp-Source: ADFU+vvXWaP3fRj/KPrE6Z5f5i6cOnkWg08gknIHvw6A2Y52E3vHJqQUV+svDxlJJrQAl4dNNJtp9A== X-Received: by 2002:a0c:fcc4:: with SMTP id i4mr14219926qvq.191.1584139919208; Fri, 13 Mar 2020 15:51:59 -0700 (PDT) Received: from ziepe.ca (hlfxns017vw-142-68-57-212.dhcp-dynamic.fibreop.ns.bellaliant.net. [142.68.57.212]) by smtp.gmail.com with ESMTPSA id u13sm29863801qtg.64.2020.03.13.15.51.58 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 13 Mar 2020 15:51:58 -0700 (PDT) Received: from jgg by mlx.ziepe.ca with local (Exim 4.90_1) (envelope-from ) id 1jCt9p-0005N0-T2; Fri, 13 Mar 2020 19:51:57 -0300 Date: Fri, 13 Mar 2020 19:51:57 -0300 From: Jason Gunthorpe To: Matthew Wilcox Cc: Steven Price , Jerome Glisse , Ralph Campbell , "Felix.Kuehling@amd.com" , Philip Yang , John Hubbard , "amd-gfx@lists.freedesktop.org" , "linux-mm@kvack.org" , "dri-devel@lists.freedesktop.org" , Christoph Hellwig Subject: Re: [PATCH] mm/hmm: Simplify hmm_vma_walk_pud slightly Message-ID: <20200313225157.GA20412@ziepe.ca> References: <5bd778fa-51e5-3e0c-d9bb-b38539b03c8d@arm.com> <20200312102813.56699-1-steven.price@arm.com> <20200312142749.GM31668@ziepe.ca> <58e296a6-d32b-bb37-28ce-ade0f784454d@arm.com> <20200312151113.GO31668@ziepe.ca> <689d3c56-3d19-4655-21f5-f9aeab3089df@arm.com> <20200312163734.GR31668@ziepe.ca> <20200313195550.GH31668@ziepe.ca> <20200313210446.GP22433@bombadil.infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200313210446.GP22433@bombadil.infradead.org> User-Agent: Mutt/1.9.4 (2018-02-28) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Mar 13, 2020 at 02:04:46PM -0700, Matthew Wilcox wrote: > On Fri, Mar 13, 2020 at 04:55:50PM -0300, Jason Gunthorpe wrote: > > On Thu, Mar 12, 2020 at 05:02:18PM +0000, Steven Price wrote: > > > On 12/03/2020 16:37, Jason Gunthorpe wrote: > > > > On Thu, Mar 12, 2020 at 04:16:33PM +0000, Steven Price wrote: > > > > > > Actually, while you are looking at this, do you think we should be > > > > > > adding at least READ_ONCE in the pagewalk.c walk_* functions? The > > > > > > multiple references of pmd, pud, etc without locking seems sketchy to > > > > > > me. > > > > > > > > > > I agree it seems worrying. I'm not entirely sure whether the holding of > > > > > mmap_sem is sufficient, > > > > > > > > I looked at this question, and at least for PMD, mmap_sem is not > > > > sufficient. I didn't easilly figure it out for the other ones > > > > > > > > I'm guessing if PMD is not safe then none of them are. > > > > > > > > > this isn't something that I changed so I've just > > > > > been hoping that it's sufficient since it seems to have been working > > > > > (whether that's by chance because the compiler didn't generate multiple > > > > > reads I've no idea). For walking the kernel's page tables the lack of > > > > > READ_ONCE is also not great, but at least for PTDUMP we don't care too much > > > > > about accuracy and it should be crash proof because there's no RCU grace > > > > > period. And again the code I was replacing didn't have any special > > > > > protection. > > > > > > > > > > I can't see any harm in updating the code to include READ_ONCE and I'm happy > > > > > to review a patch. > > > > > > > > The reason I ask is because hmm's walkers often have this pattern > > > > where they get the pointer and then de-ref it (again) then > > > > immediately have to recheck the 'again' conditions of the walker > > > > itself because the re-read may have given a different value. > > > > > > > > Having the walker deref the pointer and pass the value it into the ops > > > > for use rather than repeatedly de-refing an unlocked value seems like > > > > a much safer design to me. > > > > > > Yeah that sounds like a good idea. > > > > I'm looking at this now.. The PUD is also changing under the read > > mmap_sem - and I was able to think up some race conditiony bugs > > related to this. Have some patches now.. > > > > However, I haven't been able to understand why walk_page_range() > > doesn't check pud_present() or pmd_present() before calling > > pmd_offset_map() or pte_offset_map(). > > > > As far as I can see a non-present entry has a swap entry encoded in > > it, and thus it seems like it is a bad idea to pass a non-present > > entry to the two map functions. I think those should only be called > > when the entry points to the next level in the page table (so there > > is something to map?) > > > > I see you added !present tests for the !vma case, but why only there? > > > > Is this a bug? Do you know how it works? > > > > Is it something that was missed when people added non-present PUD and > > PMD's? > > ... I'm sorry, I did what now? No, no, just widening to see if someone knows > As far as I can tell, you're talking > about mm/pagewalk.c, and the only commit I have in that file is > a00cc7d9dd93d66a3fb83fc52aa57a4bec51c517 ("mm, x86: add support for > PUD-sized transparent hugepages", which I think I was pretty clear > from the commit message is basically copy-and-paste from the PMD > code. Right, which added the split_huge_pud() which seems maybe related to pud_present, or maybe not, I don't know. > I have no clue why most of the decisions in the MM were made. Fun! Jason