From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C19D5EB64DA for ; Wed, 19 Jul 2023 06:37:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:From:References:Cc:To: Subject:MIME-Version:Date:Message-ID:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=nPOSVgiElOITn8K8CA49ohYtPB25599/qqCmjC46DFI=; b=B7CozPwz9ukH3J +aTkr+Y/2g9W6OJe7Z8fkCK4aNj9/aS7qKrBUSp9Il6XLtY8L6zfHw5i8yJYkguLo22gFDqo+sDbi lkWjP0Cym+pM9Sdz6mYFCRhNCETYXcAZyKSv59KAJqOJ1QT7/2OWOhszMDflgnC9+sa/MiWqK9bt0 uZQiM2EAFszjBOfRDvct5t61KzyYOy0WkpyIT61l75xSdCziemy5jAGplNDnm0bJHY4BynLj1gVi3 pTWJdowXFryN8We3Bj3DY3mOey1DFK7jhMWxY0Ok8Gd9gZvMhpUBSiGSVmj0ncLTshfx7dvJX9bLG 6Fcxf8e8dC2IytVFSAOw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qM0nq-005oDE-1n; Wed, 19 Jul 2023 06:36:50 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qM0np-005oCa-1X for linux-arm-kernel@bombadil.infradead.org; Wed, 19 Jul 2023 06:36:49 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:Content-Type :In-Reply-To:From:References:Cc:To:Subject:MIME-Version:Date:Message-ID: Sender:Reply-To:Content-ID:Content-Description; bh=Z8LRS8Zyl8jRX9FEmAU4P8S/gmJzvPYSBAkjI/m59xI=; b=B/sethgUOA7/XdVrkn6AZ+Lemd Zb4WG/WjJ+FOQ1L4kGGuLzMWgJxkciTzSnMFdRgJsWFK+u4xN2emWREIuTvtMT5ndbqipsHIKtDi/ zy/Ru+ImIb6KGE79P0C44RNbp/TDXL7FAutEH0pxdoo169pRkhqSEruS6TVZqKHqnr2yarZfwKfc1 Ivyn7B6Hr7J96yTVHudNsOKcblF0jm0zhZyhAW9drt3jsTbuk3yoe71tCIn4eI5WMvWpab5gAsJyP Bu0mYXCQ8dNcw396xHgMxI+yS+zuIkj5DSK9pE8dXtavTdDknudN2voxYpUTTjIWpqKFA0TxkDegI Eg+DcC6A==; Received: from foss.arm.com ([217.140.110.172]) by desiato.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qLi4T-00BPqg-2p for linux-arm-kernel@lists.infradead.org; Tue, 18 Jul 2023 10:36:48 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B3D622F4; Tue, 18 Jul 2023 03:36:56 -0700 (PDT) Received: from [10.1.34.52] (C02Z41KALVDN.cambridge.arm.com [10.1.34.52]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id F153A3F67D; Tue, 18 Jul 2023 03:36:10 -0700 (PDT) Message-ID: Date: Tue, 18 Jul 2023 11:36:09 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Subject: Re: [PATCH v3 3/4] mm: FLEXIBLE_THP for improved performance To: Hugh Dickins Cc: Yu Zhao , Matthew Wilcox , Andrew Morton , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Catalin Marinas , Will Deacon , Anshuman Khandual , Yang Shi , "Huang, Ying" , Zi Yan , Luis Chamberlain , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org References: <20230714160407.4142030-1-ryan.roberts@arm.com> <20230714161733.4144503-3-ryan.roberts@arm.com> <432490d1-8d1e-1742-295a-d6e60a054ab6@arm.com> <5df787a0-8e69-2472-cdd6-f96a3f7dfaaf@arm.com> <8bdfd8d8-5662-4615-86dc-d60259bd16d@google.com> From: Ryan Roberts In-Reply-To: <8bdfd8d8-5662-4615-86dc-d60259bd16d@google.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230718_113646_294474_1567AEEF X-CRM114-Status: GOOD ( 27.49 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 18/07/2023 00:37, Hugh Dickins wrote: > On Mon, 17 Jul 2023, Ryan Roberts wrote: > >>>>>> +static int alloc_anon_folio(struct vm_fault *vmf, struct folio **folio) >>>>>> +{ >>>>>> + int i; >>>>>> + gfp_t gfp; >>>>>> + pte_t *pte; >>>>>> + unsigned long addr; >>>>>> + struct vm_area_struct *vma = vmf->vma; >>>>>> + int prefer = anon_folio_order(vma); >>>>>> + int orders[] = { >>>>>> + prefer, >>>>>> + prefer > PAGE_ALLOC_COSTLY_ORDER ? PAGE_ALLOC_COSTLY_ORDER : 0, >>>>>> + 0, >>>>>> + }; >>>>>> + >>>>>> + *folio = NULL; >>>>>> + >>>>>> + if (vmf_orig_pte_uffd_wp(vmf)) >>>>>> + goto fallback; >>>>>> + >>>>>> + for (i = 0; orders[i]; i++) { >>>>>> + addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]); >>>>>> + if (addr >= vma->vm_start && >>>>>> + addr + (PAGE_SIZE << orders[i]) <= vma->vm_end) >>>>>> + break; >>>>>> + } >>>>>> + >>>>>> + if (!orders[i]) >>>>>> + goto fallback; >>>>>> + >>>>>> + pte = pte_offset_map(vmf->pmd, vmf->address & PMD_MASK); >>>>>> + if (!pte) >>>>>> + return -EAGAIN; >>>>> >>>>> It would be a bug if this happens. So probably -EINVAL? >>>> >>>> Not sure what you mean? Hugh Dickins' series that went into v6.5-rc1 makes it >>>> possible for pte_offset_map() to fail (if I understood correctly) and we have to >>>> handle this. The intent is that we will return from the fault without making any >>>> change, then we will refault and try again. >>> >>> Thanks for checking that -- it's very relevant. One detail is that >>> that series doesn't affect anon. IOW, collapsing PTEs into a PMD can't >>> happen while we are holding mmap_lock for read here, and therefore, >>> the race that could cause pte_offset_map() on shmem/file PTEs to fail >>> doesn't apply here. >> >> But Hugh's patches have changed do_anonymous_page() to handle failure from >> pte_offset_map_lock(). So I was just following that pattern. If this really >> can't happen, then I'd rather WARN/BUG on it, and simplify alloc_anon_folio()'s >> prototype to just return a `struct folio *` (and if it's null that means ENOMEM). >> >> Hugh, perhaps you can comment? > > I agree with your use of -EAGAIN there: I find it better to allow for the > possibility, than to go to great effort persuading that it's impossible; > especially because what's possible tomorrow may differ from today. > > And notice that, before my changes, there used to be a pmd_trans_unstable() > check above, implying that it is possible for it to fail (for more reasons > than corruption causing pmd_bad()) - one scenario would be that the > pte_alloc() above succeeded *because* someone else had managed to insert > a huge pmd there already (maybe we have MMF_DISABLE_THP but they did not). > > But I see from later mail that Yu Zhao now agrees with your -EAGAIN too, > so we are all on the same folio. Thanks for the explanation. I think we are all now agreed that the error case needs handling and -EAGAIN is the correct code. > > Hugh > > p.s. while giving opinions, I'm one of those against using "THP" for > large but not pmd-mappable folios; and was glad to see Matthew arguing > the same way when considering THP_SWPOUT in another thread today. Honestly, I don't have an opinion either way on this (probably because I don't have the full context and history of THP like many of you do). So given there is a fair bit of opposition to FLEXIBLE_THP, I'll change it back to LARGE_ANON_FOLIO (and move it out of the THP sub-menu) in the next revision. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel