From: "Darrick J. Wong" <djwong@kernel.org>
To: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Cc: "hch@infradead.org" <hch@infradead.org>,
Theodore Ts'o <tytso@mit.edu>,
"toshi.kani@hpe.com" <toshi.kani@hpe.com>,
"dm-devel@redhat.com" <dm-devel@redhat.com>,
"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
Brian Foster <bfoster@redhat.com>,
"yangx.jy@fujitsu.com" <yangx.jy@fujitsu.com>,
Dave Chinner <david@fromorbit.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
Jeff Moyer <jmoyer@redhat.com>,
"zwisler@kernel.org" <zwisler@kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [dm-devel] [PATCH] xfs: fail dax mount if reflink is enabled on a partition
Date: Tue, 1 Nov 2022 17:45:07 -0700
Message-ID: <Y2G9k9/XJVQ7yiWN@magnolia>
In-Reply-To: <7a3aac47-1492-a3cc-c53a-53c908f4f857@fujitsu.com>

On Sun, Oct 30, 2022 at 05:31:43PM +0800, Shiyang Ruan wrote:
>
>
> On 2022/10/28 9:37, Dan Williams wrote:
> > Darrick J. Wong wrote:
> > > [add tytso to cc since he asked about "How do you actually /get/ fsdax
> > > mode these days?" this morning]
> > >
> > > On Tue, Oct 25, 2022 at 10:56:19AM -0700, Darrick J. Wong wrote:
> > > > On Tue, Oct 25, 2022 at 02:26:50PM +0000, ruansy.fnst@fujitsu.com wrote:
>
> ...skip...
>
> > > >
> > > > > Nope. Since the announcement of pmem as a product, I have had 15
> > > > > minutes of access to one preproduction prototype server with actual
> > > > > optane DIMMs in it.
> > > >
> > > > I have /never/ had access to real hardware to test any of this, so it's
> > > > all configured via libvirt to simulate pmem in qemu:
> > > > https://lore.kernel.org/linux-xfs/YzXsavOWMSuwTBEC@magnolia/
> > > >
> > > > /run/mtrdisk/[gh].mem are both regular files on a tmpfs filesystem:
> > > >
> > > > $ grep mtrdisk /proc/mounts
> > > > none /run/mtrdisk tmpfs rw,relatime,size=82894848k,inode64 0 0
> > > >
> > > > $ ls -la /run/mtrdisk/[gh].mem
> > > > -rw-r--r-- 1 libvirt-qemu kvm 10739515392 Oct 24 18:09 /run/mtrdisk/g.mem
> > > > -rw-r--r-- 1 libvirt-qemu kvm 10739515392 Oct 24 19:28 /run/mtrdisk/h.mem
> > >
> > > Also forgot to mention that the VM with the fake pmem attached has a
> > > script to do:
> > >
> > > ndctl create-namespace --mode fsdax --map dev -e namespace0.0 -f
> > > ndctl create-namespace --mode fsdax --map dev -e namespace1.0 -f
> > >
> > > Every time the pmem device gets recreated, because apparently that's the
> > > only way to get S_DAX mode nowadays?
> >
> > If you have noticed a change here it is due to VM configuration not
> > anything in the driver.
> >
> > > If you are interested, there are two ways to get pmem declared: the
> > > legacy way that predates any of the DAX work, which the kernel calls
> > > E820_PRAM, and the modern way, via platform firmware tables like ACPI NFIT. The
> > assumption with E820_PRAM is that it is dealing with battery backed
> > NVDIMMs of small capacity. In that case the /dev/pmem device can support
> > DAX operation by default because the necessary memory for the 'struct
> > page' array for that memory is likely small.
> >
> > Platform firmware defined PMEM can be terabytes. So the driver does not
> > > enable DAX by default because the user needs to make a policy choice about
> > burning gigabytes of DRAM for that metadata, or placing it in PMEM which
> > is abundant, but slower. So what I suspect might be happening is your
> > configuration changed from something that auto-allocated the 'struct
> > page' array, to something that needed those commands you list above to
> > explicitly opt-in to reserving some PMEM capacity for the page metadata.
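
To put rough numbers on the 'struct page' trade-off described in the quoted
explanation above, here is a minimal sketch. It assumes 64 bytes per struct
page and 4 KiB pages, which is typical for x86_64 but is not stated anywhere
in this thread; the function name page_array_bytes is hypothetical and exists
only for this illustration.

```shell
#!/bin/sh
# Estimate the size of the 'struct page' array needed to map a pmem region.
# ASSUMPTIONS (not from the thread): 64 bytes per struct page, 4 KiB pages.
struct_page_bytes=64
page_size=4096

# page_array_bytes <pmem_bytes>: print the metadata size in bytes.
page_array_bytes() {
    echo $(( $1 / page_size * struct_page_bytes ))
}

# One of the ~10 GiB backing files from the listing quoted earlier.
page_array_bytes 10739515392       # 167804928 bytes, about 160 MiB

# A hypothetical 1 TiB firmware-described region.
page_array_bytes 1099511627776     # 17179869184 bytes, i.e. 16 GiB
```

Under these assumptions the overhead is about 1.6% of capacity: negligible
for a small device, but gigabytes for terabyte-scale pmem, which is the
policy choice that "--map dev" versus "--map mem" expresses in
ndctl create-namespace.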
>
> I am using the same simulation environment as Darrick and Dave and have
> tested many times, but so far I still cannot reproduce the failures they
> mentioned (dax+non_reflink mode, which is my current focus). Only a few
> cases failed randomly because of "target is busy". But IIRC, the failures
> you mentioned came with dmesg warnings around the functions
> "dax_associate_entry()" or "dax_disassociate_entry()". Since I cannot
> reproduce the failure, it is hard for me to continue solving the problem.

FWIW things have calmed down as of 6.1-rc3 -- if I disable reflink,
fstests runs without complaint. Now it only seems to be affecting
reflink=1 filesystems.

> And how did your recent tests go? Do they still fail with those dmesg
> warnings? If so, could you zip the test results and send them to me?

https://djwong.org/docs/kernel/daxbad.zip

--D
>
>
> --
> Thanks,
> Ruan