From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752755AbdKIHdm (ORCPT ); Thu, 9 Nov 2017 02:33:42 -0500
Received: from mga11.intel.com ([192.55.52.93]:11096 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751117AbdKIHdl (ORCPT ); Thu, 9 Nov 2017 02:33:41 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,368,1505804400"; d="scan'208";a="171503167"
From: "Huang\, Ying"
To: Andrea Arcangeli
Cc: huang ying , Zi Yan , "Huang\, Ying" , Naoya Horiguchi , , LKML ,
	Mike Kravetz , "Mike Rapoport" , "Kirill A. Shutemov" , Alexander Viro
Subject: Re: [RFC -mm] mm, userfaultfd, THP: Avoid waiting when PMD under THP migration
References: <20171103075231.25416-1-ying.huang@intel.com> <20171106202148.GA26645@redhat.com>
Date: Thu, 09 Nov 2017 15:33:37 +0800
In-Reply-To: <20171106202148.GA26645@redhat.com> (Andrea Arcangeli's message of
	"Mon, 6 Nov 2017 21:21:48 +0100")
Message-ID: <87d14rzz1q.fsf@yhuang-dev.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Andrea Arcangeli writes:

> Hello,
>
> On Sun, Nov 05, 2017 at 11:01:05AM +0800, huang ying wrote:
>> On Fri, Nov 3, 2017 at 11:00 PM, Zi Yan wrote:
>> > On 3 Nov 2017, at 3:52, Huang, Ying wrote:
>> >
>> >> From: Huang Ying
>> >>
>> >> If THP migration is enabled, the following situation is possible,
>> >>
>> >> - A THP is mapped at source address
>> >> - Migration is started to move the THP to another node
>> >> - Page fault occurs
>> >> - The PMD (migration entry) is copied to the destination address in mremap
>> >>
>> >
>> > You mean the page fault path follows the source address and sees pmd_none() now
>> > because mremap() clears it and remaps the page with dest address.
>> > Otherwise, it seems not possible to get into handle_userfault(), since it is called in
>> > pmd_none() branch inside do_huge_pmd_anonymous_page().
>> >
>> >
>> >> That is, it is possible for handle_userfault() encounter a PMD entry
>> >> which has been handled but !pmd_present(). In the current
>> >> implementation, we will wait for such PMD entries, which may cause
>> >> unnecessary waiting, and potential soft lockup.
>> >
>> > handle_userfault() should only see pmd_none() in the situation you describe,
>> > whereas !pmd_present() (migration entry case) should lead to
>> > pmd_migration_entry_wait().
>>
>> Yes. This is my understanding of the source code too. And I
>> described it in the original patch description too. I just want to
>> make sure whether it is possible that !pmd_none() and !pmd_present()
>> for a PMD in userfaultfd_must_wait(). And, whether it is possible for
>
> I don't see how mremap is relevant above. mremap runs with mmap_sem
> for writing, so it can't race against userfaultfd_must_wait.
>
> However the concern of set_pmd_migration_entry() being called with
> only the mmap_sem for reading through TTU_MIGRATION in
> __unmap_and_move and being interpreted as a "missing" THP page by
> userfaultfd_must_wait seems valid.
>
> Compaction won't normally compact pages that are already THP sized so
> you cannot see this normally because VM don't normally get migrated
> over SHM/hugetlbfs with hard bindings while userfaults are in
> progress.
>
> Overall your patch looks more correct than current code so it's good
> idea to apply and it should avoid surprises with the above corner
> case if CONFIG_ARCH_ENABLE_THP_MIGRATION is set.
>
> Worst case the process would hang in handle_userfault(), but it will
> still respond fine to sigkill, so it's not concerning, but it should
> be fixed nevertheless.
>
> Reviewed-by: Andrea Arcangeli

Thanks! I will revise the patch description and send the new version!

Best Regards,
Huang, Ying

[snip]
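
For readers following the thread, the PMD-level distinction discussed above
can be sketched as a small user-space model. This is only an illustration
under stated assumptions, not the kernel code or the patch itself: the enum,
the model_* helpers and the must_wait_* functions are hypothetical names, and
the model ignores the PTE-level checks that the real userfaultfd_must_wait()
performs for a present, non-huge PMD.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified PMD states. The kernel derives these from the
 * bits of the PMD entry; this model only names the three cases that the
 * thread distinguishes.
 */
enum pmd_state {
    PMD_STATE_NONE,      /* empty entry: pmd_none() would be true */
    PMD_STATE_MIGRATION, /* migration entry: !pmd_none() and !pmd_present() */
    PMD_STATE_PRESENT    /* mapped entry: pmd_present() would be true */
};

/* Stand-ins for the kernel predicates, valid only for this toy model. */
static bool model_pmd_none(enum pmd_state s)
{
    return s == PMD_STATE_NONE;
}

static bool model_pmd_present(enum pmd_state s)
{
    return s == PMD_STATE_PRESENT;
}

/*
 * Behaviour described as the "current implementation": anything that is
 * not present is treated as missing, so a migration entry also waits.
 */
static bool must_wait_current(enum pmd_state s)
{
    return !model_pmd_present(s);
}

/*
 * Behaviour the thread argues for: only a truly empty PMD is "missing";
 * a PMD under THP migration does not make the faulting task wait.
 */
static bool must_wait_proposed(enum pmd_state s)
{
    return model_pmd_none(s);
}
```

In this model the two checks differ only for PMD_STATE_MIGRATION, which is
the corner case Andrea describes: a migration entry installed under mmap_sem
held for reading being interpreted as a "missing" THP page.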