linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Seema Pandit <seema.pandit@intel.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	Robert Barror <robert.barror@intel.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>
Subject: Re: [PATCH] filesystem-dax: Disable PMD support
Date: Mon, 1 Jul 2019 20:34:10 -0700	[thread overview]
Message-ID: <20190702033410.GB1729@bombadil.infradead.org> (raw)
In-Reply-To: <CAPcyv4j2NBPBEUU3UW1Q5OyOEuo9R5e90HpkowpeEkMsAKiUyQ@mail.gmail.com>

On Sun, Jun 30, 2019 at 02:37:32PM -0700, Dan Williams wrote:
> On Sun, Jun 30, 2019 at 8:23 AM Matthew Wilcox <willy@infradead.org> wrote:
> > I think my theory was slightly mistaken, but your fix has the effect of
> > fixing the actual problem too.
> >
> > The xas->xa_index for a PMD is going to be PMD-aligned (ie a multiple of
> > 512), but xas_find_conflict() does _not_ adjust xa_index (... which I
> > really should have mentioned in the documentation).  So we go to sleep
> > on the PMD-aligned index instead of the index of the PTE.  Your patch
> > fixes this by using the PMD-aligned index for PTEs too.
> >
> > I'm trying to come up with a clean fix for this.  Clearly we
> > shouldn't wait for a PTE entry if we're looking for a PMD entry.
> > But what should get_unlocked_entry() return if it detects that case?
> > We could have it return an error code encoded as an internal entry,
> > like grab_mapping_entry() does.  Or we could have it return the _locked_
> > PTE entry, and have callers interpret that.
> >
> > At least get_unlocked_entry() is static, but it's got quite a few callers.
> > Trying to discern which ones might ask for a PMD entry is a bit tricky.
> > So this seems like a large patch which might have bugs.
> >
> > Thoughts?
> 
> ...but if it was a problem of just mismatched waitqueue's I would have
> expected it to trigger prior to commit b15cd800682f "dax: Convert page
> fault handlers to XArray".

That commit converts grab_mapping_entry() (called by dax_iomap_pmd_fault())
from calling get_unlocked_mapping_entry() to calling get_unlocked_entry().
get_unlocked_mapping_entry() (eventually) called __radix_tree_lookup()
instead of dax_find_conflict().

> This hunk, if I'm reading it correctly,
> looks suspicious: @index in this case is coming directly from
> vm->pgoff without pmd alignment adjustment whereas after the
> conversion it's always pmd aligned from the xas->xa_index. So perhaps
> the issue is that the lock happens at pte granularity. I expect it
> would cause the old put_locked_mapping_entry() to WARN, but maybe that
> avoids the lockup and was missed in the bisect.

I don't think that hunk is the problem.  The __radix_tree_lookup()
is going to return a 'slot' which points to the canonical slot, no
matter which of the 512 indices corresponding to that slot is chosen.
So I think it's going to do essentially the same thing.

> @@ -884,21 +711,18 @@ static void *dax_insert_entry(struct
> address_space *mapping,
>                  * existing entry is a PMD, we will just leave the PMD in the
>                  * tree and dirty it if necessary.
>                  */
> -               struct radix_tree_node *node;
> -               void **slot;
> -               void *ret;
> -
> -               ret = __radix_tree_lookup(pages, index, &node, &slot);
> -               WARN_ON_ONCE(ret != entry);
> -               __radix_tree_replace(pages, node, slot,
> -                                    new_entry, NULL);
> +               void *old = dax_lock_entry(xas, new_entry);
> +               WARN_ON_ONCE(old != xa_mk_value(xa_to_value(entry) |
> +                                       DAX_LOCKED));
>                 entry = new_entry;
> +       } else {
> +               xas_load(xas);  /* Walk the xa_state */
>         }
> 
>         if (dirty)
> -               radix_tree_tag_set(pages, index, PAGECACHE_TAG_DIRTY);
> +               xas_set_mark(xas, PAGECACHE_TAG_DIRTY);
> 
> -       xa_unlock_irq(pages);
> +       xas_unlock_irq(xas);
>         return entry;
>  }

  reply	other threads:[~2019-07-02  3:34 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-27  0:15 [PATCH] filesystem-dax: Disable PMD support Dan Williams
2019-06-27 12:34 ` Matthew Wilcox
2019-06-27 16:06   ` Dan Williams
2019-06-27 18:29     ` Dan Williams
2019-06-27 18:58       ` Dan Williams
2019-06-27 19:09         ` Dan Williams
2019-06-27 19:59           ` Matthew Wilcox
2019-06-28  2:39             ` Dan Williams
2019-06-28 16:37               ` Matthew Wilcox
2019-06-28 16:39                 ` Dan Williams
2019-06-28 16:54                   ` Matthew Wilcox
2019-06-29 16:03               ` Matthew Wilcox
2019-06-30  7:27                 ` Dan Williams
2019-06-30  8:01                   ` Dan Williams
2019-06-30 15:23                     ` Matthew Wilcox
2019-06-30 21:37                       ` Dan Williams
2019-07-02  3:34                         ` Matthew Wilcox [this message]
2019-07-02 15:37                           ` Dan Williams
2019-07-03  0:22                             ` Boaz Harrosh
2019-07-03  0:42                               ` Dan Williams
2019-07-03  1:39                                 ` Boaz Harrosh
2019-07-01 12:11                       ` Jan Kara
2019-07-03 15:47                         ` Matthew Wilcox
2019-07-04 16:40                           ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190702033410.GB1729@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=dan.j.williams@intel.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=robert.barror@intel.com \
    --cc=seema.pandit@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).