From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DB00C001E0 for ; Wed, 2 Aug 2023 15:37:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235052AbjHBPhS (ORCPT ); Wed, 2 Aug 2023 11:37:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56586 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233956AbjHBPgp (ORCPT ); Wed, 2 Aug 2023 11:36:45 -0400 Received: from outbound-smtp06.blacknight.com (outbound-smtp06.blacknight.com [81.17.249.39]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB1003AA2 for ; Wed, 2 Aug 2023 08:36:06 -0700 (PDT) Received: from mail.blacknight.com (pemlinmail04.blacknight.ie [81.17.254.17]) by outbound-smtp06.blacknight.com (Postfix) with ESMTPS id DF381C2BAA for ; Wed, 2 Aug 2023 16:28:49 +0100 (IST) Received: (qmail 12428 invoked from network); 2 Aug 2023 15:28:49 -0000 Received: from unknown (HELO techsingularity.net) (mgorman@techsingularity.net@[84.203.20.191]) by 81.17.254.9 with ESMTPSA (AES256-SHA encrypted, authenticated); 2 Aug 2023 15:28:49 -0000 Date: Wed, 2 Aug 2023 16:28:47 +0100 From: Mel Gorman To: David Hildenbrand Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, Andrew Morton , Linus Torvalds , liubo , Peter Xu , Matthew Wilcox , Hugh Dickins , Jason Gunthorpe , John Hubbard , Mel Gorman , Shuah Khan , Paolo Bonzini Subject: Re: [PATCH v2 4/8] mm/gup: don't implicitly set FOLL_HONOR_NUMA_FAULT Message-ID: <20230802152847.c3pz5o4pfsmkuv3u@techsingularity.net> References: <20230801124844.278698-1-david@redhat.com> <20230801124844.278698-5-david@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20230801124844.278698-5-david@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org On Tue, Aug 01, 2023 at 02:48:40PM +0200, David Hildenbrand wrote: > Commit 0b9d705297b2 ("mm: numa: Support NUMA hinting page faults from > gup/gup_fast") from 2012 documented as the primary reason why we would want > to handle NUMA hinting faults from GUP: > > KVM secondary MMU page faults will trigger the NUMA hinting page > faults through gup_fast -> get_user_pages -> follow_page -> > handle_mm_fault. > > That is still the case today, and relevant KVM code has been converted to > manually set FOLL_HONOR_NUMA_FAULT. So let's stop setting > FOLL_HONOR_NUMA_FAULT for all GUP users and cross fingers that not that > many other ones that really require such handling for autonuma remain. > > Possible interaction with MMU notifiers: > > Assume a driver obtains a page using get_user_pages() to map it into > a secondary MMU, and uses the MMU notifier framework to get notified on > changes. > > Assume get_user_pages() succeeded on a PROT_NONE-mapped page (because > FOLL_HONOR_NUMA_FAULT is not set) in an accessible VMA and the page is > mapped into a secondary MMU. Once user space would turn that mapping > inaccessible using mprotect(PROT_NONE), the actual PTE in the page table > might not change. If the MMU notifier would be smart and optimize for that > case "why notify if the PTE didn't change", that could be problematic. > > At least change_pmd_range() with MMU_NOTIFY_PROTECTION_VMA for now does an > unconditional mmu_notifier_invalidate_range_start() -> > mmu_notifier_invalidate_range_end() and should be fine. > > Note that even if a PTE in an accessible VMA is pte_protnone(), the > underlying page might be accessed by a secondary MMU that does not set > FOLL_HONOR_NUMA_FAULT, and test_young() MMU notifiers would return "true". > > Signed-off-by: David Hildenbrand Also seems sane but a large portion of its correctness also depends on patch 3 being correct. -- Mel Gorman SUSE Labs