Linux CXL
From: Gregory Price <gregory.price@memverge.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org, Dave Jiang <dave.jiang@intel.com>
Subject: Re: [BUG] DAX access of Memory Expander on RCH topology fires BUG on page_table_check
Date: Wed, 19 Apr 2023 21:29:51 -0400
Message-ID: <ZECVjxh/Xgwa4Tpi@memverge.com>
In-Reply-To: <643e3a2344460_556e294a2@dwillia2-mobl3.amr.corp.intel.com.notmuch>

On Mon, Apr 17, 2023 at 11:35:15PM -0700, Dan Williams wrote:
> Gregory Price wrote:
> > Now map and access the memory via /dev/dax0.0  (test program attached)
> > 
> > [ 1028.430734] kernel BUG at mm/page_table_check.c:53!
> 
> I have never tested DAX with CONFIG_PAGE_TABLE_CHECK=y, so would need to
> dig in further here. A quick test passes the unit tests, but the unit
> tests don't have this, "map dax after system-ram" scenario. Just for
> completeness, does it behave without that debug option enabled?
> 

Confirmed passes without issues when this debug option is disabled.
Also confirmed on production hardware with a release build where this
check is disabled.

So something is up between the page table check code and converting the
memory from system-ram (a NUMA node) to dax.

> 
> i.e. just touching the memory fails, no need to mlock it? This smells
> more like the CONFIG_PAGE_TABLE_CHECK machinery is getting confused, but
> I would have expected its metadata to be reset by the dax device
> reconfiguration.

Yes, just touching the memory faults; no mlock is needed.
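
For reference, a minimal sketch of the kind of access meant here. This is
not the test program attached to the original report; touch_mapping() is a
hypothetical helper name, and the sketch assumes the caller passes a length
the device accepts (device-dax typically requires the mapping length to be
a multiple of the device alignment, often 2 MiB). It maps the file
MAP_SHARED and writes one byte per page, with no mlock() anywhere:

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of `path` with MAP_SHARED and write one byte per page
 * so that every page in the range is faulted in. No mlock() is involved.
 * Returns 0 on success or a negative errno value.
 *
 * Hypothetical helper for illustration only; not the attached test program.
 */
static int touch_mapping(const char *path, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	void *map;
	int fd, err;

	if (page <= 0 || len == 0)
		return -EINVAL;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		err = -errno;
		close(fd);
		return err;
	}

	p = map;
	for (size_t off = 0; off < len; off += (size_t)page)
		p[off] = 0xff;	/* first write to each page takes a fault */

	munmap(map, len);
	close(fd);
	return 0;
}
```

Calling touch_mapping("/dev/dax0.0", len) from main() with a suitably
aligned len would exercise the same path; the same helper also works on a
regular file, which is handy for checking the program itself.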

I dug in and the page_ext for the page is NULL, which is what causes the
BUG().  I don't know the subsystem well enough to know why converting to
dax would cause the page_ext to be NULL.

The reason this got conflated with the other hardware/firmware/BIOS
issues is that I thought the alignment issue with memory blocks might
have been part of the problem, but clearly that's not the case.

~Gregory
