From: Olga Kornievskaia <aglo@umich.edu>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: sfrench@samba.org, Anna Schumaker <anna.schumaker@netapp.com>,
trond.myklebust@hammerspace.com, fengxiaoli0714@gmail.com,
fstests@vger.kernel.org, Murphy Zhou <xzhou@redhat.com>,
CIFS <linux-cifs@vger.kernel.org>,
linux-nfs <linux-nfs@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: NFS & CIFS support dedupe now?? Was: Re: [PATCH] generic/517: notrun on NFS due to unaligned dedupe in test
Date: Fri, 31 May 2019 11:24:42 -0400 [thread overview]
Message-ID: <CAN-5tyFoNJrQTn1ZkNsmaBd=yn2kWpZObGe8ka6CDFxcHaXP6w@mail.gmail.com> (raw)
In-Reply-To: <20190530155851.GB5383@magnolia>
On Thu, May 30, 2019 at 12:02 PM Darrick J. Wong
<darrick.wong@oracle.com> wrote:
>
> Hi everyone,
>
> Murphy Zhou sent a patch to generic/517 in fstests to fix a dedupe
> failure he was seeing on NFS:
>
> On Thu, May 30, 2019 at 05:41:47PM +0800, Murphy Zhou wrote:
> > NFSv4.2 could pass _require_scratch_dedupe, since the test offset and
> > size are aligned, while generic/517 is performing unaligned dedupe.
> > NFS does not support unaligned dedupe now, returns EINVAL.
> >
> > Signed-off-by: Murphy Zhou <xzhou@redhat.com>
> > ---
> > tests/generic/517 | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/tests/generic/517 b/tests/generic/517
> > index 601bb24e..23665782 100755
> > --- a/tests/generic/517
> > +++ b/tests/generic/517
> > @@ -30,6 +30,7 @@ _cleanup()
> > _supported_fs generic
> > _supported_os Linux
> > _require_scratch_dedupe
> > +$FSTYP == "nfs" && _notrun "NFS can't handle unaligned deduplication"
>
> I was surprised to see a dedupe fix for NFS since (at least to my
> knowledge) neither of these two network filesystems actually support
> server-side deduplication commands, and therefore the
> _require_scratch_dedupe should have _notrun the test.
>
> Then I looked at fs/nfs/nfs4file.c:
>
> static loff_t nfs42_remap_file_range(struct file *src_file, loff_t src_off,
> struct file *dst_file, loff_t dst_off, loff_t count,
> unsigned int remap_flags)
> {
> <local variable declarations>
>
> if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
> return -EINVAL;
>
> <check alignment, lock inodes, flush pending writes>
>
> ret = nfs42_proc_clone(src_file, dst_file, src_off, dst_off, count);
>
> The NFS client code will accept REMAP_FILE_DEDUP through remap_flags,
> which is how dedupe requests are sent to filesystems nowadays. The nfs
> client code does not itself compare the file contents, but it does issue
> a CLONE command to the NFS server. The end result, AFAICT, is that a
> user program can write 'A's to file1, 'B's to file2, issue a dedup
> ioctl to the kernel, and have a block of 'B's mapped into file1. That's
> broken behavior, according to the dedup ioctl manpage.
>
> Notice how remap_flags is checked but is not included in the
> nfs42_proc_clone call? That's how I conclude that the NFS client cannot
> possibly be sending the dedup request to the server.
>
> The same goes for fs/cifs/cifsfs.c:
>
> static loff_t cifs_remap_file_range(struct file *src_file, loff_t off,
> struct file *dst_file, loff_t destoff, loff_t len,
> unsigned int remap_flags)
> {
> <local variable declarations>
>
> if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
> return -EINVAL;
>
> <check files, lock inodes, flush pages>
>
> if (target_tcon->ses->server->ops->duplicate_extents)
> rc = target_tcon->ses->server->ops->duplicate_extents(xid,
> smb_file_src, smb_file_target, off, len, destoff);
> else
> rc = -EOPNOTSUPP;
>
> Again, remap_flags is checked here but it has no influence over the
> ->duplicate_extents call.
>
> Next I got to thinking that when I reworked the clone/dedupe code last
> year, I didn't include REMAP_FILE_DEDUP support for cifs or nfs, because
> as far as I knew, neither protocol supports a verb for deduplication.
> The remap_flags checks were modified to allow REMAP_FILE_DEDUP in
> commits ce96e888fe48e (NFS) and b073a08016a10 (CIFS) with this
> justification (the cifs commit has a similar message):
>
> "Subject: Fix nfs4.2 return -EINVAL when do dedupe operation
>
> "dedupe_file_range operations is combiled into remap_file_range.
> " But in nfs42_remap_file_range, it's skiped for dedupe operations.
> " Before this patch:
> " # dd if=/dev/zero of=nfs/file bs=1M count=1
> " # xfs_io -c "dedupe nfs/file 4k 64k 4k" nfs/file
> " XFS_IOC_FILE_EXTENT_SAME: Invalid argument
> " After this patch:
> " # dd if=/dev/zero of=nfs/file bs=1M count=1
> " # xfs_io -c "dedupe nfs/file 4k 64k 4k" nfs/file
> " deduped 4096/4096 bytes at offset 65536
> " 4 KiB, 1 ops; 0.0046 sec (865.988 KiB/sec and 216.4971 ops/sec)"
>
> This sort of looks like monkeypatching to make an error message go away.
> One could argue that this ought to return EOPNOSUPP instead of EINVAL,
> and maybe that's what should've happened.
>
> So, uh, do NFS and CIFS both support server-side dedupe now, or are
> these patches just plain wrong?
>
> No, they're just wrong, because I can corrupt files like so on NFS:
>
> $ rm -rf urk moo
> $ xfs_io -f -c "pwrite -S 0x58 0 31048" urk
> wrote 31048/31048 bytes at offset 0
> 30 KiB, 8 ops; 0.0000 sec (569.417 MiB/sec and 153846.1538 ops/sec)
> $ xfs_io -f -c "pwrite -S 0x59 0 31048" moo
> wrote 31048/31048 bytes at offset 0
> 30 KiB, 8 ops; 0.0001 sec (177.303 MiB/sec and 47904.1916 ops/sec)
> $ md5sum urk moo
> 37d3713e5f9c4fe0f8a1f813b27cb284 urk
> a5b6f953f27aa17e42450ff4674fa2df moo
> $ xfs_io -c "dedupe urk 0 0 4096" moo
> deduped 4096/4096 bytes at offset 0
> 4 KiB, 1 ops; 0.0012 sec (3.054 MiB/sec and 781.8608 ops/sec)
> $ md5sum urk moo
> 37d3713e5f9c4fe0f8a1f813b27cb284 urk
> 2c992d70131c489da954f1d96d8c456e moo
>
> (Not sure about cifs, since I don't have a Windows Server handy)
>
> I'm not an expert in CIFS or NFS, so I'm asking: do either support
> dedupe or is this a kernel bug?
NFS does not support dedupe and only supports cloning (whole) files.
next prev parent reply other threads:[~2019-05-31 15:24 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20190530094147.14512-1-xzhou@redhat.com>
[not found] ` <20190530152606.GA5383@magnolia>
2019-05-30 15:58 ` NFS & CIFS support dedupe now?? Was: Re: [PATCH] generic/517: notrun on NFS due to unaligned dedupe in test Darrick J. Wong
2019-05-31 10:48 ` Aurélien Aptel
2019-05-31 13:28 ` Tom Talpey
2019-05-31 15:24 ` Olga Kornievskaia [this message]
2019-05-31 15:35 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAN-5tyFoNJrQTn1ZkNsmaBd=yn2kWpZObGe8ka6CDFxcHaXP6w@mail.gmail.com' \
--to=aglo@umich.edu \
--cc=anna.schumaker@netapp.com \
--cc=darrick.wong@oracle.com \
--cc=fengxiaoli0714@gmail.com \
--cc=fstests@vger.kernel.org \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=sfrench@samba.org \
--cc=trond.myklebust@hammerspace.com \
--cc=xzhou@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).