From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932983AbdKFUfa (ORCPT ); Mon, 6 Nov 2017 15:35:30 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45662 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753523AbdKFUf3 (ORCPT ); Mon, 6 Nov 2017 15:35:29 -0500 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 6E6B2BDEA Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=aarcange@redhat.com Date: Mon, 6 Nov 2017 21:35:27 +0100 From: Andrea Arcangeli To: Zi Yan Cc: huang ying , "Huang, Ying" , Naoya Horiguchi , linux-mm@kvack.org, LKML , Mike Kravetz , Mike Rapoport , "Kirill A. Shutemov" , Alexander Viro Subject: Re: [RFC -mm] mm, userfaultfd, THP: Avoid waiting when PMD under THP migration Message-ID: <20171106203527.GB26645@redhat.com> References: <20171103075231.25416-1-ying.huang@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.1 (2017-09-22) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 06 Nov 2017 20:35:29 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 06, 2017 at 10:53:48AM -0500, Zi Yan wrote: > Thanks for clarifying it. We both agree that !pmd_present(), which means > PMD migration entry, does not get into userfaultfd_must_wait(), > then there seems to be no issue with current code yet. > > However, the if (!pmd_present(_pmd)) in userfaultfd_must_wait() does not > match > the exact condition. How about the patch below? It can catch pmd > migration entries, > which are only possible in x86_64 at the moment. > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 1c713fd5b3e6..dda25444a6ee 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -294,9 +294,11 @@ static inline bool userfaultfd_must_wait(struct > userfaultfd_ctx *ctx, > * pmd_trans_unstable) of the pmd. > */ > _pmd = READ_ONCE(*pmd); > - if (!pmd_present(_pmd)) > + if (pmd_none(_pmd)) > goto out; > > + VM_BUG_ON(thp_migration_supported() && is_pmd_migration_entry(_pmd)); > + As I wrote in prev email I'm not sure about this invariant to be correct 100% of the time (plus we'd want a VM_WARN_ON only here). Specifically, what does prevent try_to_unmap to run on a THP backed mapping with only the mmap_sem for reading? I know what prevents to ever reproduce this in practice though (aside from the fact the race between the is_swap_pmd() check in the main page fault and the above check is small) and it's because compaction won't migrate THP and even the numa faults will not use the migration entry. So it'd require some more explicit migration numactl while userfaults are running to ever see an hang in there. I think it's a regression since the introduction of THP migration around commits 84c3fc4e9c563d8fb91cfdf5948da48fe1af34d3 / 616b8371539a6c487404c3b8fb04078016dab4ba / 9c670ea37947a82cb6d4df69139f7e46ed71a0ac etc.. before that pmd_none or !pmd_present used to be equivalent, not the case any longer. Of course pmd_none would have been better before too. Thanks, Andrea