From: Dave Chinner <david@fromorbit.com>
To: "ruansy.fnst@fujitsu.com" <ruansy.fnst@fujitsu.com>
Cc: "hch@infradead.org" <hch@infradead.org>,
	"toshi.kani@hpe.com" <toshi.kani@hpe.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Brian Foster <bfoster@redhat.com>,
	"yangx.jy@fujitsu.com" <yangx.jy@fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Yasunori Gotou (Fujitsu)" <y-goto@fujitsu.com>,
	Jeff Moyer <jmoyer@redhat.com>,
	"zwisler@kernel.org" <zwisler@kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [dm-devel] [PATCH] xfs: fail dax mount if reflink is enabled on a partition
Date: Mon, 24 Oct 2022 16:31:09 +1100
Message-ID: <20221024053109.GY3600936@dread.disaster.area>
In-Reply-To: <OSBPR01MB2920CA997DDE891C06776279F42E9@OSBPR01MB2920.jpnprd01.prod.outlook.com>

On Mon, Oct 24, 2022 at 03:17:52AM +0000, ruansy.fnst@fujitsu.com wrote:
> On 2022/10/24 6:00, Dave Chinner wrote:
> > On Fri, Oct 21, 2022 at 07:11:02PM -0700, Darrick J. Wong wrote:
> >> On Thu, Oct 20, 2022 at 10:17:45PM +0800, Yang, Xiao/杨 晓 wrote:
> >>> In addition, I don't like your idea about the test change because it will
> >>> make generic/470 become the special test for XFS. Do you know if we can fix
> >>> the issue by changing the test in another way? blkdiscard -z can fix the
> >>> issue because it does zero-fill rather than discard on the block device.
> >>> However, blkdiscard -z will take a lot of time when the block device is
> >>> large.
> >>
> >> Well we /could/ just do that too, but that will suck if you have 2TB of
> >> pmem. ;)
> >>
> >> Maybe as an alternative path we could just create a very small
> >> filesystem on the pmem and then blkdiscard -z it?
> >>
> >> That said -- does persistent memory actually have a future?  Intel
> >> scuttled the entire Optane product, cxl.mem sounds like expansion
> >> chassis full of DRAM, and fsdax is horribly broken in 6.0 (weird kernel
> >> asserts everywhere) and 6.1 (every time I run fstests now I see massive
> >> data corruption).
> >
> > Yup, I see the same thing. fsdax was a train wreck in 6.0 - broken
> > on both ext4 and XFS. Now that I run a quick check on 6.1-rc1, I
> > don't think that has changed at all - I still see lots of kernel
> > warnings, data corruption and "XFS_IOC_CLONE_RANGE: Invalid
> > argument" errors.
> 
> Firstly, I think the "XFS_IOC_CLONE_RANGE: Invalid argument" error is
> caused by the restrictions which prevent reflink from working together
> with DAX:
> 
> a. fs/xfs/xfs_ioctl.c:1141
> /* Don't allow us to set DAX mode for a reflinked file for now. */
> if ((fa->fsx_xflags & FS_XFLAG_DAX) && xfs_is_reflink_inode(ip))
>         return -EINVAL;
> 
> b. fs/xfs/xfs_iops.c:1174
> /* Only supported on non-reflinked files. */
> if (xfs_is_reflink_inode(ip))
>         return false;
> 
> These restrictions were removed in the "drop experimental warning"
> patch [1]. I think they should be separated from that patch.
> 
> [1]
> https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.fnst@fujitsu.com/
> 
> 
> Secondly, how did the data corruption happen?

No idea - I'm just reporting that lots of fsx tests failed with data
corruptions. I haven't had time to look at why, I'm still trying to
sort out the fix for a different data corruption...

> Or which case failed?

*lots* of them failed with kernel warnings with reflink turned off:

SECTION       -- xfs_dax_noreflink
=========================
Failures: generic/051 generic/068 generic/075 generic/083
generic/112 generic/127 generic/198 generic/231 generic/247
generic/269 generic/270 generic/340 generic/344 generic/388
generic/461 generic/471 generic/476 generic/519 generic/561 xfs/011
xfs/013 xfs/073 xfs/297 xfs/305 xfs/517 xfs/538
Failed 26 of 1079 tests

All of those except xfs/073 and generic/471 are failures due to
warnings found in dmesg.
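
For illustration only (this is not part of the original report): a
minimal Python sketch that tallies the failure list above and separates
out the two tests the message says did not fail because of dmesg
warnings. The variable names are made up for this example.

```python
# Failure list copied from the xfs_dax_noreflink summary in the message.
FAILURES = """
generic/051 generic/068 generic/075 generic/083
generic/112 generic/127 generic/198 generic/231 generic/247
generic/269 generic/270 generic/340 generic/344 generic/388
generic/461 generic/471 generic/476 generic/519 generic/561 xfs/011
xfs/013 xfs/073 xfs/297 xfs/305 xfs/517 xfs/538
""".split()

# Per the message, these two failed for reasons other than dmesg warnings.
NOT_DMESG = {"xfs/073", "generic/471"}

# Everything else in the list is attributed to warnings found in dmesg.
dmesg_failures = [test for test in FAILURES if test not in NOT_DMESG]

print(len(FAILURES))        # 26, matching "Failed 26 of 1079 tests"
print(len(dmesg_failures))  # 24 failures attributed to dmesg warnings
```

So 24 of the 26 noreflink failures are warning-triggered rather than
test output mismatches.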

With reflink enabled, I terminated the run after g/075, g/091, g/112
and generic/127 reported fsx data corruptions and g/051, g/068,
g/075 and g/083 had reported kernel warnings in dmesg.

> Could
> you give me more info (such as mkfs options, xfstests configs)?

They are exactly the same as last time I reported these problems.

For the "no reflink" test issues:

mkfs options are "-m reflink=0,rmapbt=1", mount options "-o
dax=always" for both filesystems.  Config output at start of test
run:

SECTION       -- xfs_dax_noreflink
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test3 6.1.0-rc1-dgc+ #1615 SMP PREEMPT_DYNAMIC Wed Oct 19 12:24:16 AEDT 2022
MKFS_OPTIONS  -- -f -m reflink=0,rmapbt=1 /dev/pmem1
MOUNT_OPTIONS -- -o dax=always -o context=system_u:object_r:root_t:s0 /dev/pmem1 /mnt/scratch

pmem devices are a pair of fake 8GB pmem regions set up on the kernel
command line via "memmap=8G!15G,8G!24G". I don't have anything special set up
- the kernel config is kept minimal for these VMs - and the only
kernel debug option I have turned on for these specific test runs is
CONFIG_XFS_DEBUG=y.
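
For illustration only (not part of the original message): the kernel's
"memmap=nn[KMG]!ss[KMG]" form marks the range from ss to ss+nn as
protected (persistent) memory, so the string above describes two
non-overlapping 8G regions. The Python sketch below decodes it; the
helper names are hypothetical, and the parser is deliberately simplified
(upper-case K/M/G suffixes and decimal numbers only, which is an
assumption and narrower than what the kernel accepts).

```python
# Simplified size suffixes; the kernel's own parser accepts more forms.
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '8G' to a byte count."""
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def parse_memmap(value):
    """Return (start, end) byte ranges for each 'size!start' entry."""
    regions = []
    for entry in value.split(","):
        size, start = entry.split("!")
        begin = parse_size(start)
        regions.append((begin, begin + parse_size(size)))
    return regions


regions = parse_memmap("8G!15G,8G!24G")
for begin, end in regions:
    print(f"{begin >> 30}G-{end >> 30}G")  # prints 15G-23G, then 24G-32G
```

The first region ends at 23G, below the 24G start of the second, so the
two fake pmem devices do not overlap.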

The only difference between the noreflink and reflink runs is that I
drop the "-m reflink=0" mkfs parameter. Otherwise they are identical
and the errors I reported are from back-to-back fstests runs without
rebooting the VM....

-Dave.
-- 
Dave Chinner
david@fromorbit.com
