From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932428AbdKCHyM (ORCPT ); Fri, 3 Nov 2017 03:54:12 -0400
Received: from mga14.intel.com ([192.55.52.115]:6306 "EHLO mga14.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752465AbdKCHx6 (ORCPT ); Fri, 3 Nov 2017 03:53:58 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,337,1505804400"; d="scan'208";a="917143919"
From: "Huang, Ying"
To: Naoya Horiguchi, Zi Yan
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
	Andrea Arcangeli, Mike Kravetz, Mike Rapoport,
	"Kirill A. Shutemov", Alexander Viro
Subject: [RFC -mm] mm, userfaultfd, THP: Avoid waiting when PMD under THP migration
Date: Fri, 3 Nov 2017 15:52:31 +0800
Message-Id: <20171103075231.25416-1-ying.huang@intel.com>
X-Mailer: git-send-email 2.14.2
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Huang Ying

If THP migration is enabled, the following situation is possible:

- A THP is mapped at the source address
- Migration is started to move the THP to another node
- A page fault occurs
- The PMD (migration entry) is copied to the destination address in mremap

That is, it is possible for handle_userfault() to encounter a PMD entry
which has been handled but is !pmd_present().  In the current
implementation, we will wait for such PMD entries, which may cause
unnecessary waiting and a potential soft lockup.

Fix this by not waiting when !pmd_present(), and waiting only when
pmd_none().

Question:

I found that userfaultfd_must_wait() is always called when the PMD or
PTE is none, and with the mm->mmap_sem read-lock held.  mremap() will
write-lock mm->mmap_sem.  And UFFDIO_COPY doesn't support copying THP
mappings.  So the situation described above couldn't happen in practice?

Signed-off-by: "Huang, Ying"
Cc: Andrea Arcangeli
Cc: Mike Kravetz
Cc: Mike Rapoport
Cc: "Kirill A. Shutemov"
Cc: Alexander Viro
Cc: Zi Yan
Cc: Naoya Horiguchi
---
 fs/userfaultfd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index b5a0193e1960..0fcf66c3e439 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -294,10 +294,13 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 * pmd_trans_unstable) of the pmd.
 	 */
 	_pmd = READ_ONCE(*pmd);
-	if (!pmd_present(_pmd))
+	if (pmd_none(_pmd))
 		goto out;
 
 	ret = false;
+	if (!pmd_present(_pmd))
+		goto out;
+
 	if (pmd_trans_huge(_pmd))
 		goto out;
 
-- 
2.14.2
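
For reference, the PMD-level decision before and after the change can be
modelled in user space.  This is only a sketch: the enum values and function
names below are hypothetical and are not kernel definitions.  It assumes a
PMD is in exactly one of four states, and it covers only the part of
userfaultfd_must_wait() touched by the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the PMD states discussed in the description. */
enum pmd_state {
	PMD_STATE_NONE,       /* nothing mapped: pmd_none() is true */
	PMD_STATE_MIGRATION,  /* migration entry: !pmd_none() and !pmd_present() */
	PMD_STATE_TRANS_HUGE, /* present huge mapping: pmd_trans_huge() is true */
	PMD_STATE_TABLE,      /* present, points to a page table of PTEs */
};

/* Outcome of the PMD-level check in this model. */
enum wait_decision {
	DECISION_WAIT,      /* fault not resolved yet, the caller sleeps */
	DECISION_NO_WAIT,   /* entry already handled, the caller retries */
	DECISION_CHECK_PTE, /* fall through to the PTE-level check */
};

bool model_pmd_none(enum pmd_state s)
{
	return s == PMD_STATE_NONE;
}

bool model_pmd_present(enum pmd_state s)
{
	return s == PMD_STATE_TRANS_HUGE || s == PMD_STATE_TABLE;
}

/* Before the patch: every !pmd_present() entry leads to waiting. */
enum wait_decision pmd_check_before(enum pmd_state s)
{
	if (!model_pmd_present(s))
		return DECISION_WAIT;
	if (s == PMD_STATE_TRANS_HUGE)
		return DECISION_NO_WAIT;
	return DECISION_CHECK_PTE;
}

/* After the patch: wait only for pmd_none(); a migration entry returns. */
enum wait_decision pmd_check_after(enum pmd_state s)
{
	if (model_pmd_none(s))
		return DECISION_WAIT;
	if (!model_pmd_present(s))
		return DECISION_NO_WAIT;
	if (s == PMD_STATE_TRANS_HUGE)
		return DECISION_NO_WAIT;
	return DECISION_CHECK_PTE;
}

/* The two versions differ only for the migration-entry state. */
void model_self_check(void)
{
	assert(pmd_check_before(PMD_STATE_MIGRATION) == DECISION_WAIT);
	assert(pmd_check_after(PMD_STATE_MIGRATION) == DECISION_NO_WAIT);
}
```

In this model the only state whose outcome changes is the migration entry,
which is the case the description is concerned with.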