From: "Kani, Toshimitsu" <toshi.kani@hpe.com>
To: "ross.zwisler@linux.intel.com" <ross.zwisler@linux.intel.com>,
	"jack@suse.cz" <jack@suse.cz>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: Infinite loop with DAX PMD faults
Date: Fri, 28 Oct 2016 13:51:30 +0000
Message-ID: <1477662579.20881.108.camel@hpe.com>
In-Reply-To: <20161028081759.GD30952@quack2.suse.cz>

On Fri, 2016-10-28 at 10:17 +0200, Jan Kara wrote:
> On Thu 27-10-16 22:13:00, Ross Zwisler wrote:
> > 
> > On Thu, Oct 27, 2016 at 09:48:41PM +0000, Kani, Toshimitsu wrote:
> > > 
> > > On Thu, 2016-10-27 at 15:03 -0600, Ross Zwisler wrote:
> > > > 
> > > > On Thu, Oct 27, 2016 at 12:46:32PM -0700, Dan Williams wrote:
> > > > > 
> > > > > 
> > > > > On Thu, Oct 27, 2016 at 12:07 PM, Jan Kara <jack@suse.cz>
> > > > > wrote:
> > > > > > 
> > > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > When testing my DAX patches rebased on top of Ross' DAX PMD
> > > > > > series, I've come across the following issue with
> > > > > > generic/344 test from xfstests. The test ends in an
> > > > > > infinite fault loop when we fault index 0 over and over
> > > > > > again never finishing the fault. The problem is that we do
> > > > > > a write fault for index 0 when there is PMD for that index.
> > > > > > So we enter wp_huge_pmd(). For whatever reason that returns
> > > > > > VM_FAULT_FALLBACK so we continue to handle_pte_fault().
> > > > > > There we do
> > > > > > 
> > > > > > >         if (pmd_trans_unstable(vmf->pmd) ||
> > > > > > >             pmd_devmap(*vmf->pmd))
> > > > > > 
> > > > > > check which is true - the PMD we have is pmd_trans_huge() -
> > > > > > so we 'return 0' and that results in retrying the fault and
> > > > > > all happens from the beginning again.
> > > > > > 
> > > > > > It isn't quite obvious how to break that cycle to me. The
> > > > > > comment before pmd_none_or_trans_huge_or_clear_bad() goes
> > > > > > to great lengths explaining possible races when PMD is
> > > > > > pmd_trans_huge() so it needs careful evaluation what needs
> > > > > > to be done for DAX. Ross, any idea?
> > > > > 
> > > > > Can you bisect it with CONFIG_BROKEN removed from older
> > > > > kernels?
> > > > > 
> > > > > I remember tracking down something like this when initially
> > > > > doing the pmd support.  It ended up being a missed
> > > > > pmd_devmap() check in the fault path, so it may not be the
> > > > > same issue.  It would at least be interesting to see if 4.6
> > > > > fails in a similar manner with this test and FS_DAX_PMD
> > > > > enabled.
> > > > 
> > > > I've been able to reproduce this with my v4.9-rc2 branch, but
> > > > it doesn't reproduce with the old v4.6 kernel.
> > > 
> > > Not sure if it's relevant, but as FYI I fixed a similar issue
> > > before.
> > > 
> > > commit 59bf4fb9d386601cbaa70a9b00159abb846dedaa
> > > dax: Split pmd map when fallback on COW
> > > 
> > > -Toshi
> > 
> > Thanks!  Applying a similar patch solves this
> > deadlock.  Unfortunately I don't (yet?) understand this well enough
> > to say whether this is the correct solution, but it makes
> > generic/344 + PMDs pass. :)
> > 
> > Does anyone with more mm knowledge have time to review?
> 
> I'm not really much into huge pages but AFAICT that should fix the
> problem. I'm just not sure whether in other cases when we return
> VM_FAULT_FALLBACK we don't need something similar. Probably this will
> need some experiments ;).

Good to know it worked. :-) I think we need to split the pmd mapping
when we fall back on COW. The pte handler cannot proceed while a pmd
mapping is still in place.
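
As a rough illustration only, the retry loop can be modelled in user
space as below. This is a toy sketch, not kernel code: every toy_* name
is made up for the example, and it assumes the huge write-protect
handler always falls back and the pte handler always asks for a retry
while the pmd is still huge, as described earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model of the fault retry loop; NOT kernel code.
 * All toy_* names are hypothetical stand-ins for the kernel paths
 * discussed in this thread. */

enum toy_result { TOY_DONE, TOY_RETRY, TOY_FALLBACK };

struct toy_pmd {
	bool huge;	/* true while a PMD-sized mapping is installed */
};

/* Stand-in for the huge write-protect handler: it cannot do the COW
 * at PMD granularity, so it always falls back.  With split_on_fallback
 * set it also tears down the huge mapping before returning. */
static enum toy_result toy_wp_huge(struct toy_pmd *pmd, bool split_on_fallback)
{
	if (split_on_fallback)
		pmd->huge = false;
	return TOY_FALLBACK;
}

/* Stand-in for the pte handler: it bails out and asks for a retry
 * while the PMD is still huge. */
static enum toy_result toy_pte_fault(const struct toy_pmd *pmd)
{
	return pmd->huge ? TOY_RETRY : TOY_DONE;
}

/* Replays the fault until it completes or max_tries runs out.
 * Returns the number of attempts used, or -1 if it never finished. */
static int toy_fault_attempts(bool split_on_fallback, int max_tries)
{
	struct toy_pmd pmd = { .huge = true };

	assert(max_tries > 0);

	for (int i = 1; i <= max_tries; i++) {
		/* Write fault on a huge PMD goes to the huge handler first. */
		if (pmd.huge &&
		    toy_wp_huge(&pmd, split_on_fallback) != TOY_FALLBACK)
			return i;
		/* On fallback, continue to the pte handler. */
		if (toy_pte_fault(&pmd) == TOY_DONE)
			return i;
	}
	return -1;
}
```

In this model toy_fault_attempts(false, n) never completes for any n,
while toy_fault_attempts(true, n) finishes on the first attempt because
the pte handler no longer sees a huge pmd.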

Thanks,
-Toshi

