From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Jan Kara <jack@suse.cz>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: Infinite loop with DAX PMD faults
Date: Thu, 27 Oct 2016 15:03:43 -0600
Message-ID: <20161027210343.GA12217@linux.intel.com>
In-Reply-To: <CAPcyv4gS3LwAq77t31aRCxuHu9UeC4axJbQmxS-JQJzBVV=GmQ@mail.gmail.com>
On Thu, Oct 27, 2016 at 12:46:32PM -0700, Dan Williams wrote:
> On Thu, Oct 27, 2016 at 12:07 PM, Jan Kara <jack@suse.cz> wrote:
> > Hello,
> >
> > When testing my DAX patches rebased on top of Ross' DAX PMD series, I've
> > come across the following issue with generic/344 test from xfstests. The
> > test ends in an infinite fault loop when we fault index 0 over and over
> > again never finishing the fault. The problem is that we do a write fault
> > for index 0 when there is PMD for that index. So we enter wp_huge_pmd().
> > For whatever reason that returns VM_FAULT_FALLBACK so we continue to
> > handle_pte_fault(). There we do
> >
> > if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
> >
> > check which is true - the PMD we have is pmd_trans_huge() - so we 'return
> > 0' and that results in retrying the fault and all happens from the
> > beginning again.
> >
> > It isn't quite obvious how to break that cycle to me. The comment before
> > pmd_none_or_trans_huge_or_clear_bad() goes to great lengths explaining
> > possible races when PMD is pmd_trans_huge() so it needs careful evaluation
> > what needs to be done for DAX. Ross, any idea?
>
> Can you bisect it with CONFIG_BROKEN removed from older kernels?
>
> I remember tracking down something like this when initially doing the
> pmd support. It ended up being a missed pmd_devmap() check in the
> fault path, so it may not be the same issue. It would at least be
> interesting to see if 4.6 fails in a similar manner with this test and
> FS_DAX_PMD enabled.

I've been able to reproduce this with my v4.9-rc2 branch, but it doesn't
reproduce with the old v4.6 kernel.

My guess is that this is because in the old v4.6 kernel PMD faults don't
actually work most of the time, since most users don't pass a 2MiB-aligned
address to mmap(). This was fixed by Toshi's patches:

  dbe6ec8 ext2/4, xfs: call thp_get_unmapped_area() for pmd mappings
  74d2fad thp, dax: add thp_get_unmapped_area for pmd mappings

Anyway, I'm off to try and understand this failure more deeply.
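
For anyone following along, here is a rough user-space sketch of the alignment
point above. It is not taken from Toshi's patches or from any kernel code; the
helper names (is_pmd_aligned, pmd_align_up, reserve_pmd_aligned) and the
PMD_SIZE_BYTES constant are made up for illustration, and it assumes the 2MiB
PMD size used on x86-64. It only shows how a caller could check whether an
address returned by mmap() is 2MiB-aligned, and how to carve an aligned window
out of a slightly larger reservation.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: 2 MiB PMD size, as on x86-64. */
#define PMD_SIZE_BYTES (2UL << 20)

/* Returns nonzero when addr sits on a 2 MiB boundary. */
static int is_pmd_aligned(const void *addr)
{
	return ((uintptr_t)addr & (PMD_SIZE_BYTES - 1)) == 0;
}

/* Rounds addr up to the next 2 MiB boundary (no-op if already aligned). */
static void *pmd_align_up(void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	return (void *)((a + PMD_SIZE_BYTES - 1) &
			~(uintptr_t)(PMD_SIZE_BYTES - 1));
}

/*
 * Hypothetical helper: reserve len bytes of address space starting on a
 * 2 MiB boundary. It over-allocates by one PMD so that an aligned window
 * of len bytes always fits inside the reservation. The raw mapping and
 * its length are handed back so the caller can munmap() it later.
 * Returns NULL on failure.
 */
static void *reserve_pmd_aligned(size_t len, void **raw_out, size_t *raw_len_out)
{
	size_t raw_len = len + PMD_SIZE_BYTES;
	void *raw = mmap(NULL, raw_len, PROT_NONE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (raw == MAP_FAILED)
		return NULL;

	*raw_out = raw;
	*raw_len_out = raw_len;
	return pmd_align_up(raw);
}
```

A caller would then map the DAX file over the aligned window (for example with
MAP_FIXED) instead of relying on whatever address a plain mmap(NULL, ...)
happens to return.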