* [PATCH-RFC-RESEND 1/9] vfs: pull btrfs clone API to vfs layer
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
[not found] ` <1440577010-122867-2-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2015-08-26 8:16 ` [PATCH-RFC-RESEND 2/9] vfs/btrfs: add .clone_range file operation Peng Tao
` (7 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
Now that a few file systems are adding clone functionality, namingly
btrfs, NFS (later in the series) and XFS
(ttp://oss.sgi.com/archives/xfs/2015-06/msg00407.html), it makes sense
to pull the ioctl to common code.
Add vfs_file_clone_range() helper and .clone_range file operation interface
to allow underlying filesystems to clone between regular files.
The change in do_vfs_ioctl() is defered to next patch where btrfs
.clone_range is added, just so that we don't break btrfs CLONE ioctl
with this patch.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/ioctl.c | 24 ++++++++++++++++++++++++
fs/read_write.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 4 ++++
include/uapi/linux/fs.h | 9 +++++++++
4 files changed, 82 insertions(+)
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 5d01d26..726c5d7 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -215,6 +215,30 @@ static int ioctl_fiemap(struct file *filp, unsigned long arg)
return error;
}
+static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
+ u64 off, u64 olen, u64 destoff)
+{
+ struct fd src_file = fdget(srcfd);
+ int ret;
+
+ if (!src_file.file)
+ return -EBADF;
+ ret = vfs_file_clone_range(src_file.file, dst_file, off, olen, destoff);
+
+ fdput(src_file);
+ return ret;
+}
+
+static long ioctl_file_clone_range(struct file *file, void __user *argp)
+{
+ struct file_clone_range args;
+
+ if (copy_from_user(&args, argp, sizeof(args)))
+ return -EFAULT;
+ return ioctl_file_clone(file, args.src_fd, args.src_offset,
+ args.src_length, args.dest_offset);
+}
+
#ifdef CONFIG_BLOCK
static inline sector_t logical_to_blk(struct inode *inode, loff_t offset)
diff --git a/fs/read_write.c b/fs/read_write.c
index 819ef3f..beaad2c 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -16,6 +16,7 @@
#include <linux/pagemap.h>
#include <linux/splice.h>
#include <linux/compat.h>
+#include <linux/mount.h>
#include "internal.h"
#include <asm/uaccess.h>
@@ -1327,3 +1328,47 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
return do_sendfile(out_fd, in_fd, NULL, count, 0);
}
#endif
+
+int vfs_file_clone_range(struct file *src_file, struct file *dst_file,
+ loff_t off, size_t len, loff_t dstoff)
+{
+ struct inode *src_ino;
+ struct inode *dst_ino;
+ ssize_t ret;
+
+ if (!(src_file->f_mode & FMODE_READ) ||
+ !(dst_file->f_mode & FMODE_WRITE) ||
+ (dst_file->f_flags & O_APPEND) ||
+ !src_file->f_op || !src_file->f_op->clone_range)
+ return -EINVAL;
+
+ src_ino = file_inode(src_file);
+ dst_ino = file_inode(dst_file);
+
+ if (S_ISDIR(src_ino->i_mode) || S_ISDIR(dst_ino->i_mode))
+ return -EISDIR;
+
+ /* sanity check on offsets and length */
+ if (off + len < off || dstoff + len < dstoff ||
+ off + len > i_size_read(src_ino))
+ return -EINVAL;
+
+ if (src_ino->i_sb != dst_ino->i_sb ||
+ src_file->f_path.mnt != dst_file->f_path.mnt)
+ return -EXDEV;
+
+ ret = mnt_want_write_file(dst_file);
+ if (ret)
+ return ret;
+
+ ret = src_file->f_op->clone_range(src_file, dst_file, off, len, dstoff);
+ if (!ret) {
+ fsnotify_access(src_file);
+ fsnotify_modify(dst_file);
+ }
+
+ mnt_drop_write_file(dst_file);
+
+ return ret;
+}
+EXPORT_SYMBOL(vfs_file_clone_range);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cc008c3..612d7f4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1628,6 +1628,8 @@ struct file_operations {
long (*fallocate)(struct file *file, int mode, loff_t offset,
loff_t len);
void (*show_fdinfo)(struct seq_file *m, struct file *f);
+ int (*clone_range)(struct file *src_file, struct file *dst_file,
+ loff_t off, size_t len, loff_t dstoff);
#ifndef CONFIG_MMU
unsigned (*mmap_capabilities)(struct file *);
#endif
@@ -2678,6 +2680,8 @@ int __dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t,
int dax_pfn_mkwrite(struct vm_area_struct *, struct vm_fault *);
#define dax_mkwrite(vma, vmf, gb, iod) dax_fault(vma, vmf, gb, iod)
#define __dax_mkwrite(vma, vmf, gb, iod) __dax_fault(vma, vmf, gb, iod)
+int vfs_file_clone_range(struct file *src_file, struct file *dst_file,
+ loff_t off, size_t len, loff_t dstoff);
#ifdef CONFIG_BLOCK
typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 9b964a5..ac7f1c5 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -39,6 +39,13 @@
#define RENAME_EXCHANGE (1 << 1) /* Exchange source and dest */
#define RENAME_WHITEOUT (1 << 2) /* Whiteout source */
+struct file_clone_range {
+ __s64 src_fd;
+ __u64 src_offset;
+ __u64 src_length;
+ __u64 dest_offset;
+};
+
struct fstrim_range {
__u64 start;
__u64 len;
@@ -159,6 +166,8 @@ struct inodes_stat_t {
#define FIFREEZE _IOWR('X', 119, int) /* Freeze */
#define FITHAW _IOWR('X', 120, int) /* Thaw */
#define FITRIM _IOWR('X', 121, struct fstrim_range) /* Trim */
+#define FICLONE _IOW(0x94, 9, int) /* Clone */
+#define FICLONERANGE _IOW(0x94, 13, struct file_clone_range) /* Clone range */
#define FS_IOC_GETFLAGS _IOR('f', 1, long)
#define FS_IOC_SETFLAGS _IOW('f', 2, long)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 2/9] vfs/btrfs: add .clone_range file operation
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 1/9] vfs: pull btrfs clone API to vfs layer Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
2015-08-26 13:17 ` Christoph Hellwig
2015-08-26 8:16 ` [PATCH-RFC-RESEND 3/9] nfs42: add CLONE xdr functions Peng Tao
` (6 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
Also add vfs callers in do_vfs_ioctl(). Now btrfs CLONE
ioctl goes through vfs_file_clone_range().
One question to btrfs developers:
Does btrfs_clone() ever return a positive value? If it does, what does it mean
to users?
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/btrfs/ctree.h | 2 ++
fs/btrfs/file.c | 1 +
fs/btrfs/ioctl.c | 68 +++++-----------------------------------------
fs/ioctl.c | 6 ++++
include/uapi/linux/btrfs.h | 10 -------
5 files changed, 16 insertions(+), 71 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index aac314e..af3e224 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3969,6 +3969,8 @@ void btrfs_get_block_group_info(struct list_head *groups_list,
struct btrfs_ioctl_space_info *space);
void update_ioctl_balance_args(struct btrfs_fs_info *fs_info, int lock,
struct btrfs_ioctl_balance_args *bargs);
+int btrfs_file_clone_range(struct file *src_file, struct file *dst_file,
+ loff_t off, size_t olen, loff_t destoff);
/* file.c */
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index b823fac..c9170fd 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2813,6 +2813,7 @@ const struct file_operations btrfs_file_operations = {
.fsync = btrfs_sync_file,
.fallocate = btrfs_fallocate,
.unlocked_ioctl = btrfs_ioctl,
+ .clone_range = btrfs_file_clone_range,
#ifdef CONFIG_COMPAT
.compat_ioctl = btrfs_ioctl,
#endif
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0770c91..3d026c8 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3719,13 +3719,12 @@ out:
return ret;
}
-static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
- u64 off, u64 olen, u64 destoff)
+int btrfs_file_clone_range(struct file *src_file, struct file *dst_file,
+ loff_t off, size_t olen, loff_t destoff)
{
- struct inode *inode = file_inode(file);
+ struct inode *inode = file_inode(dst_file);
struct btrfs_root *root = BTRFS_I(inode)->root;
- struct fd src_file;
- struct inode *src;
+ struct inode *src = file_inode(src_file);
int ret;
u64 len = olen;
u64 bs = root->fs_info->sb->s_blocksize;
@@ -3742,49 +3741,16 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
* be either compressed or non-compressed.
*/
- /* the destination must be opened for writing */
- if (!(file->f_mode & FMODE_WRITE) || (file->f_flags & O_APPEND))
- return -EINVAL;
-
if (btrfs_root_readonly(root))
return -EROFS;
- ret = mnt_want_write_file(file);
- if (ret)
- return ret;
-
- src_file = fdget(srcfd);
- if (!src_file.file) {
- ret = -EBADF;
- goto out_drop_write;
- }
-
- ret = -EXDEV;
- if (src_file.file->f_path.mnt != file->f_path.mnt)
- goto out_fput;
-
- src = file_inode(src_file.file);
-
- ret = -EINVAL;
- if (src == inode)
- same_inode = 1;
-
- /* the src must be open for reading */
- if (!(src_file.file->f_mode & FMODE_READ))
- goto out_fput;
-
/* don't make the dst file partly checksummed */
if ((BTRFS_I(src)->flags & BTRFS_INODE_NODATASUM) !=
(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM))
- goto out_fput;
-
- ret = -EISDIR;
- if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
- goto out_fput;
+ return -EINVAL;
- ret = -EXDEV;
- if (src->i_sb != inode->i_sb)
- goto out_fput;
+ if (src == inode)
+ same_inode = 1;
if (!same_inode) {
if (inode < src) {
@@ -3800,8 +3766,6 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
/* determine range to clone */
ret = -EINVAL;
- if (off + len > src->i_size || off + len < off)
- goto out_unlock;
if (len == 0)
olen = len = src->i_size - off;
/* if we extend to eof, continue to block boundary */
@@ -3877,23 +3841,9 @@ out_unlock:
} else {
mutex_unlock(&src->i_mutex);
}
-out_fput:
- fdput(src_file);
-out_drop_write:
- mnt_drop_write_file(file);
return ret;
}
-static long btrfs_ioctl_clone_range(struct file *file, void __user *argp)
-{
- struct btrfs_ioctl_clone_range_args args;
-
- if (copy_from_user(&args, argp, sizeof(args)))
- return -EFAULT;
- return btrfs_ioctl_clone(file, args.src_fd, args.src_offset,
- args.src_length, args.dest_offset);
-}
-
/*
* there are many ways the trans_start and trans_end ioctls can lead
* to deadlocks. They should only be used by applications that
@@ -5433,10 +5383,6 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_dev_info(root, argp);
case BTRFS_IOC_BALANCE:
return btrfs_ioctl_balance(file, NULL);
- case BTRFS_IOC_CLONE:
- return btrfs_ioctl_clone(file, arg, 0, 0, 0);
- case BTRFS_IOC_CLONE_RANGE:
- return btrfs_ioctl_clone_range(file, argp);
case BTRFS_IOC_TRANS_START:
return btrfs_ioctl_trans_start(file);
case BTRFS_IOC_TRANS_END:
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 726c5d7..41c6080 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -624,6 +624,12 @@ int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
case FIGETBSZ:
return put_user(inode->i_sb->s_blocksize, argp);
+ case FICLONE:
+ return ioctl_file_clone(filp, arg, 0, 0, 0);
+
+ case FICLONERANGE:
+ return ioctl_file_clone_range(filp, argp);
+
default:
if (S_ISREG(inode->i_mode))
error = file_ioctl(filp, cmd, arg);
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index b6dec05..c3a23d3 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -316,12 +316,6 @@ struct btrfs_ioctl_search_args_v2 {
__u64 buf[0]; /* out - found items */
};
-struct btrfs_ioctl_clone_range_args {
- __s64 src_fd;
- __u64 src_offset, src_length;
- __u64 dest_offset;
-};
-
/* flags for the defrag range ioctl */
#define BTRFS_DEFRAG_RANGE_COMPRESS 1
#define BTRFS_DEFRAG_RANGE_START_IO 2
@@ -548,7 +542,6 @@ static inline char *btrfs_err_str(enum btrfs_err_code err_code)
#define BTRFS_IOC_TRANS_END _IO(BTRFS_IOCTL_MAGIC, 7)
#define BTRFS_IOC_SYNC _IO(BTRFS_IOCTL_MAGIC, 8)
-#define BTRFS_IOC_CLONE _IOW(BTRFS_IOCTL_MAGIC, 9, int)
#define BTRFS_IOC_ADD_DEV _IOW(BTRFS_IOCTL_MAGIC, 10, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_RM_DEV _IOW(BTRFS_IOCTL_MAGIC, 11, \
@@ -556,9 +549,6 @@ static inline char *btrfs_err_str(enum btrfs_err_code err_code)
#define BTRFS_IOC_BALANCE _IOW(BTRFS_IOCTL_MAGIC, 12, \
struct btrfs_ioctl_vol_args)
-#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
- struct btrfs_ioctl_clone_range_args)
-
#define BTRFS_IOC_SUBVOL_CREATE _IOW(BTRFS_IOCTL_MAGIC, 14, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_SNAP_DESTROY _IOW(BTRFS_IOCTL_MAGIC, 15, \
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH-RFC-RESEND 2/9] vfs/btrfs: add .clone_range file operation
2015-08-26 8:16 ` [PATCH-RFC-RESEND 2/9] vfs/btrfs: add .clone_range file operation Peng Tao
@ 2015-08-26 13:17 ` Christoph Hellwig
2015-08-27 6:15 ` Peng Tao
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2015-08-26 13:17 UTC (permalink / raw)
To: Peng Tao
Cc: linux-fsdevel, Trond Myklebust, Anna Schumaker, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs
You need to eep the existing defintion, either as a duplicate with a
comment, or by including the uapi fs.h and defininin it to the new name.
Also this will break cifs, which uses the BTRFS_IOC_CLONE defintion
already.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH-RFC-RESEND 2/9] vfs/btrfs: add .clone_range file operation
2015-08-26 13:17 ` Christoph Hellwig
@ 2015-08-27 6:15 ` Peng Tao
0 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-27 6:15 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Devel FS Linux, Trond Myklebust, Anna Schumaker, Zach Brown,
Darren Hart, Bruce Fields, Jeff Layton, Linux NFS Mailing List,
Darrick J. Wong, linux-btrfs
On Wed, Aug 26, 2015 at 9:17 PM, Christoph Hellwig <hch@infradead.org> wrote:
> You need to eep the existing defintion, either as a duplicate with a
> comment, or by including the uapi fs.h and defininin it to the new name.
OK. will keep it.
>
> Also this will break cifs, which uses the BTRFS_IOC_CLONE defintion
> already.
I'll look into it as well.
Thanks,
Tao
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 3/9] nfs42: add CLONE xdr functions
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 1/9] vfs: pull btrfs clone API to vfs layer Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 2/9] vfs/btrfs: add .clone_range file operation Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 4/9] nfs42: add CLONE proc functions Peng Tao
` (5 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
xdr definitions per draft-ietf-nfsv4-minorversion2-38.txt
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/nfs/nfs42xdr.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++-
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 2 +
include/linux/nfs_xdr.h | 19 ++++++++++
4 files changed, 118 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index a6bd27d..a761421 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -34,6 +34,12 @@
1 /* opaque devaddr4 length */ + \
XDR_QUADLEN(PNFS_LAYOUTSTATS_MAXSIZE))
#define decode_layoutstats_maxsz (op_decode_hdr_maxsz)
+#define encode_clone_maxsz (encode_stateid_maxsz + \
+ encode_stateid_maxsz + \
+ 2 /* src offset */ + \
+ 2 /* dst offset */ + \
+ 2 /* count */)
+#define decode_clone_maxsz (op_decode_hdr_maxsz)
#define NFS4_enc_allocate_sz (compound_encode_hdr_maxsz + \
encode_putfh_maxsz + \
@@ -65,7 +71,20 @@
decode_sequence_maxsz + \
decode_putfh_maxsz + \
PNFS_LAYOUTSTATS_MAXDEV * decode_layoutstats_maxsz)
-
+#define NFS4_enc_clone_sz (compound_encode_hdr_maxsz + \
+ encode_sequence_maxsz + \
+ encode_putfh_maxsz + \
+ encode_savefh_maxsz + \
+ encode_putfh_maxsz + \
+ encode_clone_maxsz + \
+ encode_getattr_maxsz)
+#define NFS4_dec_clone_sz (compound_decode_hdr_maxsz + \
+ decode_sequence_maxsz + \
+ decode_putfh_maxsz + \
+ decode_savefh_maxsz + \
+ decode_putfh_maxsz + \
+ decode_clone_maxsz + \
+ decode_getattr_maxsz)
static void encode_fallocate(struct xdr_stream *xdr,
struct nfs42_falloc_args *args)
@@ -128,6 +147,21 @@ static void encode_layoutstats(struct xdr_stream *xdr,
encode_uint32(xdr, 0);
}
+static void encode_clone(struct xdr_stream *xdr,
+ struct nfs42_clone_args *args,
+ struct compound_hdr *hdr)
+{
+ __be32 *p;
+
+ encode_op_hdr(xdr, OP_CLONE, decode_clone_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->src_stateid);
+ encode_nfs4_stateid(xdr, &args->dst_stateid);
+ p = reserve_space(xdr, 3*8);
+ p = xdr_encode_hyper(p, args->src_offset);
+ p = xdr_encode_hyper(p, args->dst_offset);
+ xdr_encode_hyper(p, args->count);
+}
+
/*
* Encode ALLOCATE request
*/
@@ -206,6 +240,27 @@ static void nfs4_xdr_enc_layoutstats(struct rpc_rqst *req,
encode_nops(&hdr);
}
+/*
+ * Encode CLONE request
+ */
+static void nfs4_xdr_enc_clone(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ struct nfs42_clone_args *args)
+{
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->seq_args, &hdr);
+ encode_putfh(xdr, args->src_fh, &hdr);
+ encode_savefh(xdr, &hdr);
+ encode_putfh(xdr, args->dst_fh, &hdr);
+ encode_clone(xdr, args, &hdr);
+ encode_getfattr(xdr, args->dst_bitmask, &hdr);
+ encode_nops(&hdr);
+}
+
static int decode_allocate(struct xdr_stream *xdr, struct nfs42_falloc_res *res)
{
return decode_op_hdr(xdr, OP_ALLOCATE);
@@ -244,6 +299,11 @@ static int decode_layoutstats(struct xdr_stream *xdr,
return decode_op_hdr(xdr, OP_LAYOUTSTATS);
}
+static int decode_clone(struct xdr_stream *xdr)
+{
+ return decode_op_hdr(xdr, OP_CLONE);
+}
+
/*
* Decode ALLOCATE request
*/
@@ -352,4 +412,39 @@ out:
return status;
}
+/*
+ * Decode CLONE request
+ */
+static int nfs4_xdr_dec_clone(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ struct nfs42_clone_res *res)
+{
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_savefh(xdr);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_clone(xdr);
+ if (status)
+ goto out;
+ status = decode_getfattr(xdr, res->dst_fattr, res->server);
+
+out:
+ res->rpc_status = status;
+ return status;
+}
+
#endif /* __LINUX_FS_NFS_NFS4_2XDR_H */
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index c42459e..a453bd1 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7434,6 +7434,7 @@ struct rpc_procinfo nfs4_procedures[] = {
PROC(ALLOCATE, enc_allocate, dec_allocate),
PROC(DEALLOCATE, enc_deallocate, dec_deallocate),
PROC(LAYOUTSTATS, enc_layoutstats, dec_layoutstats),
+ PROC(CLONE, enc_clone, dec_clone),
#endif /* CONFIG_NFS_V4_2 */
};
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 00121f2..c0c695b 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -130,6 +130,7 @@ enum nfs_opnum4 {
OP_READ_PLUS = 68,
OP_SEEK = 69,
OP_WRITE_SAME = 70,
+ OP_CLONE = 71,
OP_ILLEGAL = 10044,
};
@@ -501,6 +502,7 @@ enum {
NFSPROC4_CLNT_ALLOCATE,
NFSPROC4_CLNT_DEALLOCATE,
NFSPROC4_CLNT_LAYOUTSTATS,
+ NFSPROC4_CLNT_CLONE,
};
/* nfs41 types */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index b9b5304..79918d6 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -359,6 +359,25 @@ struct nfs42_layoutstat_data {
struct nfs42_layoutstat_res res;
};
+struct nfs42_clone_args {
+ struct nfs4_sequence_args seq_args;
+ struct nfs_fh *src_fh;
+ struct nfs_fh *dst_fh;
+ nfs4_stateid src_stateid;
+ nfs4_stateid dst_stateid;
+ __u64 src_offset;
+ __u64 dst_offset;
+ __u64 count;
+ const u32 *dst_bitmask;
+};
+
+struct nfs42_clone_res {
+ struct nfs4_sequence_res seq_res;
+ unsigned int rpc_status;
+ struct nfs_fattr *dst_fattr;
+ const struct nfs_server *server;
+};
+
struct stateowner_id {
__u64 create_time;
__u32 uniquifier;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 4/9] nfs42: add CLONE proc functions
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
` (2 preceding siblings ...)
2015-08-26 8:16 ` [PATCH-RFC-RESEND 3/9] nfs42: add CLONE xdr functions Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 5/9] nfs42: add .copy_range file operation Peng Tao
` (4 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/nfs/nfs42.h | 1 +
fs/nfs/nfs42proc.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 3 +-
include/linux/nfs_fs_sb.h | 1 +
4 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
index ff66ae7..3585f3c 100644
--- a/fs/nfs/nfs42.h
+++ b/fs/nfs/nfs42.h
@@ -17,6 +17,7 @@ int nfs42_proc_deallocate(struct file *, loff_t, loff_t);
loff_t nfs42_proc_llseek(struct file *, loff_t, int);
int nfs42_proc_layoutstats_generic(struct nfs_server *,
struct nfs42_layoutstat_data *);
+int nfs42_proc_clone(struct file *, struct file *, loff_t, loff_t, loff_t);
/* nfs4.2xdr.h */
extern struct rpc_procinfo nfs4_2_procedures[];
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index d731bbf..baa3141 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -269,3 +269,74 @@ int nfs42_proc_layoutstats_generic(struct nfs_server *server,
return PTR_ERR(task);
return 0;
}
+
+static int _nfs42_proc_clone(struct rpc_message *msg, struct file *src_f,
+ struct file *dst_f, loff_t src_offset,
+ loff_t dst_offset, loff_t count)
+{
+ struct inode *src_inode = file_inode(src_f);
+ struct inode *dst_inode = file_inode(dst_f);
+ struct nfs_server *server = NFS_SERVER(dst_inode);
+ struct nfs42_clone_args args = {
+ .src_fh = NFS_FH(src_inode),
+ .dst_fh = NFS_FH(dst_inode),
+ .src_offset = src_offset,
+ .dst_offset = dst_offset,
+ .dst_bitmask = server->cache_consistency_bitmask,
+ };
+ struct nfs42_clone_res res = {
+ .server = server,
+ };
+ int status;
+
+ msg->rpc_argp = &args;
+ msg->rpc_resp = &res;
+
+ status = nfs42_set_rw_stateid(&args.src_stateid, src_f, FMODE_READ);
+ if (status)
+ return status;
+
+ status = nfs42_set_rw_stateid(&args.dst_stateid, dst_f, FMODE_WRITE);
+ if (status)
+ return status;
+
+ res.dst_fattr = nfs_alloc_fattr();
+ if (!res.dst_fattr)
+ return -ENOMEM;
+
+ status = nfs4_call_sync(server->client, server, msg,
+ &args.seq_args, &res.seq_res, 0);
+ if (status == 0)
+ status = nfs_post_op_update_inode(dst_inode, res.dst_fattr);
+
+ kfree(res.dst_fattr);
+ return status;
+}
+
+int nfs42_proc_clone(struct file *src_f, struct file *dst_f,
+ loff_t src_offset, loff_t dst_offset, loff_t count)
+{
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_CLONE],
+ };
+ struct inode *inode = file_inode(src_f);
+ struct nfs_server *server = NFS_SERVER(file_inode(src_f));
+ struct nfs4_exception exception = { };
+ int err;
+
+ if (!nfs_server_capable(inode, NFS_CAP_CLONE))
+ return -EOPNOTSUPP;
+
+ do {
+ err = _nfs42_proc_clone(&msg, src_f, dst_f, src_offset,
+ dst_offset, count);
+ if (err == -ENOTSUPP || err == -EOPNOTSUPP) {
+ NFS_SERVER(inode)->caps &= ~NFS_CAP_CLONE;
+ return -EOPNOTSUPP;
+ }
+ err = nfs4_handle_exception(server, err, &exception);
+ } while (exception.retry);
+
+ return err;
+
+}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6e988fd..fb51b75 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -8659,7 +8659,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
| NFS_CAP_ALLOCATE
| NFS_CAP_DEALLOCATE
| NFS_CAP_SEEK
- | NFS_CAP_LAYOUTSTATS,
+ | NFS_CAP_LAYOUTSTATS
+ | NFS_CAP_CLONE,
.init_client = nfs41_init_client,
.shutdown_client = nfs41_shutdown_client,
.match_stateid = nfs41_match_stateid,
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 20bc8e5..2fb163a 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -238,5 +238,6 @@ struct nfs_server {
#define NFS_CAP_ALLOCATE (1U << 20)
#define NFS_CAP_DEALLOCATE (1U << 21)
#define NFS_CAP_LAYOUTSTATS (1U << 22)
+#define NFS_CAP_CLONE (1U << 23)
#endif
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 5/9] nfs42: add .copy_range file operation
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
` (3 preceding siblings ...)
2015-08-26 8:16 ` [PATCH-RFC-RESEND 4/9] nfs42: add CLONE proc functions Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
[not found] ` <1440577010-122867-6-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2015-08-26 8:16 ` [PATCH-RFC-RESEND 6/9] nfs: get clone_blksize when probing fsinfo Peng Tao
` (3 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/nfs/nfs4file.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index dcd39d4..c335cb0 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -4,6 +4,7 @@
* Copyright (C) 1992 Rick Sladkey
*/
#include <linux/fs.h>
+#include <linux/file.h>
#include <linux/falloc.h>
#include <linux/nfs_fs.h>
#include "internal.h"
@@ -166,6 +167,54 @@ static long nfs42_fallocate(struct file *filep, int mode, loff_t offset, loff_t
return nfs42_proc_deallocate(filep, offset, len);
return nfs42_proc_allocate(filep, offset, len);
}
+
+static noinline int
+nfs42_file_clone_range(struct file *src_file, struct file *dst_file,
+ loff_t src_off, size_t dst_off, loff_t count)
+{
+ struct inode *dst_inode = file_inode(dst_file);
+ struct inode *src_inode = file_inode(src_file);
+ int ret;
+
+ /* src and dst must be different files */
+ if (src_inode == dst_inode)
+ return -EINVAL;
+
+ /* XXX: do we lock at all? what if server needs CB_RECALL_LAYOUT? */
+ if (dst_inode < src_inode) {
+ mutex_lock_nested(&dst_inode->i_mutex, I_MUTEX_PARENT);
+ mutex_lock_nested(&src_inode->i_mutex, I_MUTEX_CHILD);
+ } else {
+ mutex_lock_nested(&src_inode->i_mutex, I_MUTEX_PARENT);
+ mutex_lock_nested(&dst_inode->i_mutex, I_MUTEX_CHILD);
+ }
+
+ /* flush all pending writes on both src and dst so that server
+ * has the latest data */
+ ret = nfs_sync_inode(src_inode);
+ if (ret)
+ goto out_unlock;
+ ret = nfs_sync_inode(dst_inode);
+ if (ret)
+ goto out_unlock;
+
+ ret = nfs42_proc_clone(src_file, dst_file, src_off, dst_off, count);
+
+ /* truncate inode page cache of the dst range so that future reads can fetch
+ * new data from server */
+ if (!ret)
+ truncate_inode_pages_range(&dst_inode->i_data, dst_off, dst_off + count - 1);
+
+out_unlock:
+ if (dst_inode < src_inode) {
+ mutex_unlock(&src_inode->i_mutex);
+ mutex_unlock(&dst_inode->i_mutex);
+ } else {
+ mutex_unlock(&dst_inode->i_mutex);
+ mutex_unlock(&src_inode->i_mutex);
+ }
+ return ret;
+}
#endif /* CONFIG_NFS_V4_2 */
const struct file_operations nfs4_file_operations = {
@@ -187,6 +236,7 @@ const struct file_operations nfs4_file_operations = {
.splice_write = iter_file_splice_write,
#ifdef CONFIG_NFS_V4_2
.fallocate = nfs42_fallocate,
+ .clone_range = nfs42_file_clone_range,
#endif /* CONFIG_NFS_V4_2 */
.check_flags = nfs_check_flags,
.setlease = simple_nosetlease,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 6/9] nfs: get clone_blksize when probing fsinfo
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
` (4 preceding siblings ...)
2015-08-26 8:16 ` [PATCH-RFC-RESEND 5/9] nfs42: add .copy_range file operation Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 7/9] nfs42: respect clone_blksize Peng Tao
` (2 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
NFSv42 CLONE operation is supposed to respect it.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/nfs/client.c | 1 +
fs/nfs/nfs4proc.c | 1 +
fs/nfs/nfs4xdr.c | 25 +++++++++++++++++++++++++
include/linux/nfs4.h | 1 +
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 1 +
6 files changed, 30 insertions(+)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 57c5a02..d6d5d2a 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -764,6 +764,7 @@ static void nfs_server_set_fsinfo(struct nfs_server *server,
server->time_delta = fsinfo->time_delta;
+ server->clone_blksize = fsinfo->clone_blksize;
/* We're airborne Set socket buffersize */
rpc_setbufsize(server->client, server->wsize + 100, server->rsize + 100);
}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index fb51b75..ecf792c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -239,6 +239,7 @@ const u32 nfs4_fsinfo_bitmap[3] = { FATTR4_WORD0_MAXFILESIZE
FATTR4_WORD1_TIME_DELTA
| FATTR4_WORD1_FS_LAYOUT_TYPES,
FATTR4_WORD2_LAYOUT_BLKSIZE
+ | FATTR4_WORD2_CLONE_BLKSIZE
};
const u32 nfs4_fs_locations_bitmap[3] = {
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index a453bd1..f0dce533 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -4737,6 +4737,28 @@ static int decode_attr_layout_blksize(struct xdr_stream *xdr, uint32_t *bitmap,
return 0;
}
+/*
+ * The granularity of a CLONE operation.
+ */
+static int decode_attr_clone_blksize(struct xdr_stream *xdr, uint32_t *bitmap,
+ uint32_t *res)
+{
+ __be32 *p;
+
+ dprintk("%s: bitmap is %x\n", __func__, bitmap[2]);
+ *res = 0;
+ if (bitmap[2] & FATTR4_WORD2_CLONE_BLKSIZE) {
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p)) {
+ print_overflow_msg(__func__, xdr);
+ return -EIO;
+ }
+ *res = be32_to_cpup(p);
+ bitmap[2] &= ~FATTR4_WORD2_CLONE_BLKSIZE;
+ }
+ return 0;
+}
+
static int decode_fsinfo(struct xdr_stream *xdr, struct nfs_fsinfo *fsinfo)
{
unsigned int savep;
@@ -4771,6 +4793,9 @@ static int decode_fsinfo(struct xdr_stream *xdr, struct nfs_fsinfo *fsinfo)
status = decode_attr_layout_blksize(xdr, bitmap, &fsinfo->blksize);
if (status)
goto xdr_error;
+ status = decode_attr_clone_blksize(xdr, bitmap, &fsinfo->clone_blksize);
+ if (status)
+ goto xdr_error;
status = verify_attr_len(xdr, savep, attrlen);
xdr_error:
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index c0c695b..e7e7853 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -422,6 +422,7 @@ enum lock_type4 {
#define FATTR4_WORD2_LAYOUT_TYPES (1UL << 0)
#define FATTR4_WORD2_LAYOUT_BLKSIZE (1UL << 1)
#define FATTR4_WORD2_MDSTHRESHOLD (1UL << 4)
+#define FATTR4_WORD2_CLONE_BLKSIZE (1UL << 13)
#define FATTR4_WORD2_SECURITY_LABEL (1UL << 16)
/* MDS threshold bitmap bits */
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 2fb163a..db00311 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -147,6 +147,7 @@ struct nfs_server {
unsigned int acdirmax;
unsigned int namelen;
unsigned int options; /* extra options enabled by mount */
+ unsigned int clone_blksize; /* granularity of a CLONE operation */
#define NFS_OPTION_FSCACHE 0x00000001 /* - local caching enabled */
#define NFS_OPTION_MIGRATION 0x00000002 /* - NFSv4 migration enabled */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 79918d6..e89dafd 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -141,6 +141,7 @@ struct nfs_fsinfo {
__u32 lease_time; /* in seconds */
__u32 layouttype; /* supported pnfs layout driver */
__u32 blksize; /* preferred pnfs io block size */
+ __u32 clone_blksize; /* granularity of a CLONE operation */
};
struct nfs_fsstat {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 7/9] nfs42: respect clone_blksize
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
` (5 preceding siblings ...)
2015-08-26 8:16 ` [PATCH-RFC-RESEND 6/9] nfs: get clone_blksize when probing fsinfo Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
2015-08-26 8:16 ` [PATCH-RFC-RESEND 8/9] nfsd: Pass filehandle to nfs4_preprocess_stateid_op() Peng Tao
[not found] ` <1440577010-122867-1-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
8 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Peng Tao
draft-ietf-nfsv4-minorversion2-38.txt says:
Both cl_src_offset and
cl_dst_offset must be aligned to the clone block size Section 12.2.1.
The number of bytes to be cloned must be a multiple of the clone
block size, except in the case in which cl_src_offset plus the number
of bytes to be cloned is equal to the source file size.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
---
fs/nfs/nfs4file.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index c335cb0..ee8c014 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -174,12 +174,20 @@ nfs42_file_clone_range(struct file *src_file, struct file *dst_file,
{
struct inode *dst_inode = file_inode(dst_file);
struct inode *src_inode = file_inode(src_file);
+ struct nfs_server *server = NFS_SERVER(dst_inode);
+ unsigned int bs = server->clone_blksize;
int ret;
/* src and dst must be different files */
if (src_inode == dst_inode)
return -EINVAL;
+ /* check alignment w.r.t. clone_blksize */
+ if (bs)
+ if (!IS_ALIGNED(src_off, bs) || !IS_ALIGNED(dst_off, bs) ||
+ (!IS_ALIGNED(count, bs) && i_size_read(src_inode) != (src_off + count)))
+ return -EINVAL;
+
/* XXX: do we lock at all? what if server needs CB_RECALL_LAYOUT? */
if (dst_inode < src_inode) {
mutex_lock_nested(&dst_inode->i_mutex, I_MUTEX_PARENT);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH-RFC-RESEND 8/9] nfsd: Pass filehandle to nfs4_preprocess_stateid_op()
2015-08-26 8:16 [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Peng Tao
` (6 preceding siblings ...)
2015-08-26 8:16 ` [PATCH-RFC-RESEND 7/9] nfs42: respect clone_blksize Peng Tao
@ 2015-08-26 8:16 ` Peng Tao
[not found] ` <1440577010-122867-1-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
8 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields, Jeff Layton, linux-nfs, Darrick J. Wong,
linux-btrfs, Anna Schumaker, Anna Schumaker
From: Anna Schumaker <Anna.Schumaker@netapp.com>
This will be needed so COPY can look up the saved_fh in addition to the
current_fh.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
fs/nfsd/nfs4proc.c | 16 +++++++++-------
fs/nfsd/nfs4state.c | 5 ++---
fs/nfsd/state.h | 4 ++--
3 files changed, 13 insertions(+), 12 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 90cfda7..d34c967 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -776,8 +776,9 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
/* check stateid */
- status = nfs4_preprocess_stateid_op(rqstp, cstate, &read->rd_stateid,
- RD_STATE, &read->rd_filp, &read->rd_tmp_file);
+ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
+ &read->rd_stateid, RD_STATE,
+ &read->rd_filp, &read->rd_tmp_file);
if (status) {
dprintk("NFSD: nfsd4_read: couldn't process stateid!\n");
goto out;
@@ -923,7 +924,8 @@ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (setattr->sa_iattr.ia_valid & ATTR_SIZE) {
status = nfs4_preprocess_stateid_op(rqstp, cstate,
- &setattr->sa_stateid, WR_STATE, NULL, NULL);
+ &cstate->current_fh, &setattr->sa_stateid,
+ WR_STATE, NULL, NULL);
if (status) {
dprintk("NFSD: nfsd4_setattr: couldn't process stateid!\n");
return status;
@@ -987,8 +989,8 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (write->wr_offset >= OFFSET_MAX)
return nfserr_inval;
- status = nfs4_preprocess_stateid_op(rqstp, cstate, stateid, WR_STATE,
- &filp, NULL);
+ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
+ stateid, WR_STATE, &filp, NULL);
if (status) {
dprintk("NFSD: nfsd4_write: couldn't process stateid!\n");
return status;
@@ -1018,7 +1020,7 @@ nfsd4_fallocate(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status = nfserr_notsupp;
struct file *file;
- status = nfs4_preprocess_stateid_op(rqstp, cstate,
+ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
&fallocate->falloc_stateid,
WR_STATE, &file, NULL);
if (status != nfs_ok) {
@@ -1057,7 +1059,7 @@ nfsd4_seek(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
struct file *file;
- status = nfs4_preprocess_stateid_op(rqstp, cstate,
+ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
&seek->seek_stateid,
RD_STATE, &file, NULL);
if (status) {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 61dfb33..7b0059d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4645,10 +4645,9 @@ nfs4_check_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfs4_stid *s,
*/
__be32
nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
- struct nfsd4_compound_state *cstate, stateid_t *stateid,
- int flags, struct file **filpp, bool *tmp_file)
+ struct nfsd4_compound_state *cstate, struct svc_fh *fhp,
+ stateid_t *stateid, int flags, struct file **filpp, bool *tmp_file)
{
- struct svc_fh *fhp = &cstate->current_fh;
struct inode *ino = d_inode(fhp->fh_dentry);
struct net *net = SVC_NET(rqstp);
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 4874ce5..d3e81ce 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -584,8 +584,8 @@ struct nfsd4_compound_state;
struct nfsd_net;
extern __be32 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
- struct nfsd4_compound_state *cstate, stateid_t *stateid,
- int flags, struct file **filp, bool *tmp_file);
+ struct nfsd4_compound_state *cstate, struct svc_fh *fhp,
+ stateid_t *stateid, int flags, struct file **filp, bool *tmp_file);
__be32 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
stateid_t *stateid, unsigned char typemask,
struct nfs4_stid **s, struct nfsd_net *nn);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
[parent not found: <1440577010-122867-1-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>]
* [PATCH-RFC-RESEND 9/9] NFSD: Implement the CLONE call
[not found] ` <1440577010-122867-1-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
@ 2015-08-26 8:16 ` Peng Tao
2015-08-26 13:11 ` [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Christoph Hellwig
1 sibling, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-26 8:16 UTC (permalink / raw)
To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
Cc: Trond Myklebust, Anna Schumaker, Christoph Hellwig, Zach Brown,
Darren Hart, bfields-uC3wQj2KruNg9hUCZPvPmw, Jeff Layton,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, Darrick J. Wong,
linux-btrfs-u79uwXL29TY76Z2rM5mHXA, Anna Schumaker,
Christoph Hellwig, Peng Tao
From: Anna Schumaker <bjschuma-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
I can simply call vfs_file_clone_range() and have the vfs do the
right thing for the filesystem being exported.
Signed-off-by: Anna Schumaker <bjschuma-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
[hch: change to implement the CLONE op instead of COPY]
Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Peng Tao <tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
---
fs/nfsd/nfs4proc.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/nfs4xdr.c | 21 ++++++++++++++
fs/nfsd/vfs.c | 7 +++++
fs/nfsd/vfs.h | 1 +
fs/nfsd/xdr4.h | 10 +++++++
include/linux/nfs4.h | 4 +--
6 files changed, 120 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index d34c967..c2d3558 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1014,6 +1014,79 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
}
static __be32
+nfsd4_verify_clone(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+ struct nfsd4_clone *clone, struct file **src, struct file **dst)
+{
+ struct inode *src_ino, *dst_ino;
+ __be32 status;
+
+ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->save_fh,
+ &clone->cl_src_stateid, RD_STATE,
+ src, NULL);
+ if (status) {
+ dprintk("NFSD: %s: couldn't process src stateid!\n", __func__);
+ return status;
+ }
+
+ status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
+ &clone->cl_dst_stateid, WR_STATE,
+ dst, NULL);
+ if (status) {
+ dprintk("NFSD: %s: couldn't process dst stateid!\n", __func__);
+ fput(*src);
+ }
+
+ /* a few extra check to make sure we send back proper errors per RFC */
+ src_ino = file_inode(*src);
+ dst_ino = file_inode(*dst);
+
+ if (S_ISDIR(src_ino->i_mode) || S_ISDIR(dst_ino->i_mode)) {
+ status = nfserr_wrong_type;
+ goto out_fput;
+ }
+
+ if (src_ino == dst_ino) {
+ status = nfserr_inval;
+ goto out_fput;
+ }
+
+ if (!(*src)->f_op || !(*src)->f_op->clone_range) {
+ status = nfserr_notsupp;
+ goto out_fput;
+ }
+out:
+ return status;
+out_fput:
+ fput(*src);
+ fput(*dst);
+ goto out;
+}
+
+static __be32
+nfsd4_clone(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+ struct nfsd4_clone *clone)
+{
+ int ret;
+ __be32 status;
+ struct file *src = NULL, *dst = NULL;
+
+ status = nfsd4_verify_clone(rqstp, cstate, clone, &src, &dst);
+ if (status)
+ return status;
+
+ ret = nfsd4_clone_range(src, dst, clone->cl_src_pos,
+ clone->cl_count, clone->cl_dst_pos);
+ if (ret < 0)
+ status = nfserrno(ret);
+ else
+ status = nfs_ok;
+
+ fput(src);
+ fput(dst);
+ return status;
+}
+
+static __be32
nfsd4_fallocate(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_fallocate *fallocate, int flags)
{
@@ -2283,6 +2356,12 @@ static struct nfsd4_operation nfsd4_ops[] = {
.op_name = "OP_DEALLOCATE",
.op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
},
+ [OP_CLONE] = {
+ .op_func = (nfsd4op_func)nfsd4_clone,
+ .op_flags = OP_MODIFIES_SOMETHING | OP_CACHEME,
+ .op_name = "OP_CLONE",
+ .op_rsize_bop = (nfsd4op_rsize)nfsd4_only_status_rsize,
+ },
[OP_SEEK] = {
.op_func = (nfsd4op_func)nfsd4_seek,
.op_name = "OP_SEEK",
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 5463385..a845db1 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1675,6 +1675,25 @@ nfsd4_decode_fallocate(struct nfsd4_compoundargs *argp,
}
static __be32
+nfsd4_decode_clone(struct nfsd4_compoundargs *argp, struct nfsd4_clone *clone)
+{
+ DECODE_HEAD;
+
+ status = nfsd4_decode_stateid(argp, &clone->cl_src_stateid);
+ if (status)
+ return status;
+ status = nfsd4_decode_stateid(argp, &clone->cl_dst_stateid);
+ if (status)
+ return status;
+
+ READ_BUF(8 + 8 + 8);
+ p = xdr_decode_hyper(p, &clone->cl_src_pos);
+ p = xdr_decode_hyper(p, &clone->cl_dst_pos);
+ p = xdr_decode_hyper(p, &clone->cl_count);
+ DECODE_TAIL;
+}
+
+static __be32
nfsd4_decode_seek(struct nfsd4_compoundargs *argp, struct nfsd4_seek *seek)
{
DECODE_HEAD;
@@ -1785,6 +1804,7 @@ static nfsd4_dec nfsd4_dec_ops[] = {
[OP_READ_PLUS] = (nfsd4_dec)nfsd4_decode_notsupp,
[OP_SEEK] = (nfsd4_dec)nfsd4_decode_seek,
[OP_WRITE_SAME] = (nfsd4_dec)nfsd4_decode_notsupp,
+ [OP_CLONE] = (nfsd4_dec)nfsd4_decode_clone,
};
static inline bool
@@ -4249,6 +4269,7 @@ static nfsd4_enc nfsd4_enc_ops[] = {
[OP_READ_PLUS] = (nfsd4_enc)nfsd4_encode_noop,
[OP_SEEK] = (nfsd4_enc)nfsd4_encode_seek,
[OP_WRITE_SAME] = (nfsd4_enc)nfsd4_encode_noop,
+ [OP_CLONE] = (nfsd4_enc)nfsd4_encode_noop,
};
/*
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index b5e077a..7d43097 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -36,6 +36,7 @@
#endif /* CONFIG_NFSD_V3 */
#ifdef CONFIG_NFSD_V4
+#include "../internal.h"
#include "acl.h"
#include "idmap.h"
#endif /* CONFIG_NFSD_V4 */
@@ -498,6 +499,12 @@ __be32 nfsd4_set_nfs4_label(struct svc_rqst *rqstp, struct svc_fh *fhp,
}
#endif
+int nfsd4_clone_range(struct file *src, struct file *dst, u64 src_pos,
+ u64 count, u64 dst_pos)
+{
+ return vfs_file_clone_range(src, dst, src_pos, count, dst_pos);
+}
+
__be32 nfsd4_vfs_fallocate(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct file *file, loff_t offset, loff_t len,
int flags)
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 5be875e..31d56a3 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -56,6 +56,7 @@ __be32 nfsd4_set_nfs4_label(struct svc_rqst *, struct svc_fh *,
struct xdr_netobj *);
__be32 nfsd4_vfs_fallocate(struct svc_rqst *, struct svc_fh *,
struct file *, loff_t, loff_t, int);
+int nfsd4_clone_range(struct file *, struct file *, u64, u64, u64);
#endif /* CONFIG_NFSD_V4 */
__be32 nfsd_create(struct svc_rqst *, struct svc_fh *,
char *name, int len, struct iattr *attrs,
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 9f99100..3d70712 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -491,6 +491,15 @@ struct nfsd4_fallocate {
u64 falloc_length;
};
+struct nfsd4_clone {
+ /* request */
+ stateid_t cl_src_stateid;
+ stateid_t cl_dst_stateid;
+ u64 cl_src_pos;
+ u64 cl_dst_pos;
+ u64 cl_count;
+};
+
struct nfsd4_seek {
/* request */
stateid_t seek_stateid;
@@ -555,6 +564,7 @@ struct nfsd4_op {
/* NFSv4.2 */
struct nfsd4_fallocate allocate;
struct nfsd4_fallocate deallocate;
+ struct nfsd4_clone clone;
struct nfsd4_seek seek;
} u;
struct nfs4_replay * replay;
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index e7e7853..21c6612 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -139,10 +139,10 @@ enum nfs_opnum4 {
Needs to be updated if more operations are defined in future.*/
#define FIRST_NFS4_OP OP_ACCESS
-#define LAST_NFS4_OP OP_WRITE_SAME
#define LAST_NFS40_OP OP_RELEASE_LOCKOWNER
#define LAST_NFS41_OP OP_RECLAIM_COMPLETE
-#define LAST_NFS42_OP OP_WRITE_SAME
+#define LAST_NFS42_OP OP_CLONE
+#define LAST_NFS4_OP LAST_NFS42_OP
enum nfsstat4 {
NFS4_OK = 0,
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support
[not found] ` <1440577010-122867-1-git-send-email-tao.peng-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2015-08-26 8:16 ` [PATCH-RFC-RESEND 9/9] NFSD: Implement the CLONE call Peng Tao
@ 2015-08-26 13:11 ` Christoph Hellwig
2015-08-27 6:24 ` Peng Tao
1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2015-08-26 13:11 UTC (permalink / raw)
To: Peng Tao
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Trond Myklebust,
Anna Schumaker, Christoph Hellwig, Zach Brown, Darren Hart,
bfields-uC3wQj2KruNg9hUCZPvPmw, Jeff Layton,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, Darrick J. Wong,
linux-btrfs-u79uwXL29TY76Z2rM5mHXA
On Wed, Aug 26, 2015 at 04:16:41PM +0800, Peng Tao wrote:
> 2. vfs_file_clone_range() does not expect file systems to do data copy, and thus
> no rw_verify_area() required.
While there is no physical copy involved, the data logically is copied from
file A to file B. So we still need all the checks from rw_verify_area.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support
2015-08-26 13:11 ` [PATCH-RFC-RESEND 0/9] NFS/NFSD: add NFSv42 CLONE operation support Christoph Hellwig
@ 2015-08-27 6:24 ` Peng Tao
0 siblings, 0 replies; 22+ messages in thread
From: Peng Tao @ 2015-08-27 6:24 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Devel FS Linux, Trond Myklebust, Anna Schumaker, Zach Brown,
Darren Hart, Bruce Fields, Jeff Layton, Linux NFS Mailing List,
Darrick J. Wong, linux-btrfs
On Wed, Aug 26, 2015 at 9:11 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Aug 26, 2015 at 04:16:41PM +0800, Peng Tao wrote:
>> 2. vfs_file_clone_range() does not expect file systems to do data copy, and thus
>> no rw_verify_area() required.
>
> While there is no physical copy involved, the data logically is copied from
> file A to file B. So we still need all the checks from rw_verify_area.
I see. Will add the checks.
Cheers,
Tao
^ permalink raw reply [flat|nested] 22+ messages in thread