[PATCH v4 0/6] ceph: kernel client cephfs quota support

* [PATCH v4 0/6] ceph: kernel client cephfs quota support
@ 2018-01-05 10:47 Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 1/6] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

A cephfs-specific quota implementation has been available in the
user-space fuse client for a while.  It allows an administrator to
restrict the number of bytes and/or the number of files in a filesystem
subtree.  However, it is enforced at the client level only, which means
that all clients accessing the filesystem must cooperate.

This obviously assumes that all clients are trusted entities and will
respect the quotas, preventing users from exceeding the quota limits.
Since the kernel client doesn't support quotas, it has not been possible
to use it in a cluster where quotas are a requirement.

This patchset adds kernel client support for cephfs quotas as they are
currently implemented in the ceph fuse client.  Note, however, that it
relies on some still-to-be-merged changes to the MDS (see "Changes since
v1" below for details).

For further details on CephFS quota, see [1].

[1] http://docs.ceph.com/docs/master/cephfs/quota/

** Changes since v3 **

- Rework after review from Yan, Zheng:
  * ceph_handle_quota(): Always increment message sequence number, even
    if inode isn't in cache
  * renamed inode variables ino -> in
  * get_quota_realm() now returns a ceph_snap_realm instead of
    ceph_inode_info

- Updated quota.c copyright and added SPDX identifier

- Added max_bytes quota implementation

- Updated Documentation/filesystems/ceph.txt to include a reference to
  quotas; also added a few more comments to the code.

** Changes since v2 **

Rework after review from Yan, Zheng:

- Dropped patch 0001 ("ceph: add seqlock for snaprealm hierarchy change
  detection") and use mdsc->snap_rwsem for walking the snaprealm
  hierarchy instead of adding a seqlock.  This means that patches 0003
  and 0004 needed to be reworked.

- Added a NULL check in ceph_handle_quota() after the inode lookup with
  ceph_find_inode().

** Changes since v1 **

Instead of trying to do a reverse path walk to find the "quota realm"
for a given directory, this patchset is now using snaprealms.  Thus, for
testing it, a modified MDS is required:

  https://github.com/ukernel/ceph/tree/wip-cephfs-quota-realm

This modified MDS creates a snaprealm when a quota is set in a
directory.  This means that a client needs only to walk up the snaprealm
hierarchy to find a directory that has quotas instead of doing the full
reverse path walking.

Note however that this requires an extra patch that adds a seqlock (1st
patch in series) to detect changes in the snaprealm hierarchy.

Luis Henriques (6):
  ceph: quota: add initial infrastructure to support cephfs quotas
  ceph: quota: support for ceph.quota.max_files
  ceph: quota: don't allow cross-quota renames
  ceph: quota: support for ceph.quota.max_bytes
  ceph: quota: update MDS when max_bytes is approaching
  ceph: quota: add quotas to the in-tree cephfs documentation

 Documentation/filesystems/ceph.txt |  12 ++
 fs/ceph/Makefile                   |   2 +-
 fs/ceph/dir.c                      |  16 +++
 fs/ceph/file.c                     |  21 ++-
 fs/ceph/inode.c                    |  10 ++
 fs/ceph/mds_client.c               |  23 ++++
 fs/ceph/mds_client.h               |   2 +
 fs/ceph/quota.c                    | 276 +++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h                    |  14 ++
 fs/ceph/xattr.c                    |  44 ++++++
 include/linux/ceph/ceph_features.h |   3 +-
 include/linux/ceph/ceph_fs.h       |  17 +++
 12 files changed, 437 insertions(+), 3 deletions(-)
 create mode 100644 fs/ceph/quota.c

* [PATCH v4 1/6] ceph: quota: add initial infrastructure to support cephfs quotas
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
@ 2018-01-05 10:47 ` Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 2/6] ceph: quota: support for ceph.quota.max_files Luis Henriques
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

This patch adds the infrastructure required to support cephfs quotas as
they are currently implemented in the ceph fuse client.  Cephfs quotas can
be set on any directory, and can restrict the number of bytes or the number
of files stored beneath that point in the directory hierarchy.

Quotas are set using the extended attributes 'ceph.quota.max_files' and
'ceph.quota.max_bytes', and can be removed by setting these attributes to
'0'.
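
As a rough userspace illustration (not part of this patch), a quota could
be set or cleared from C with setxattr(2).  The helper names quota_format()
and quota_set() are made up for this sketch, and it assumes the attribute
value is passed as a decimal string, as setfattr would do.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/xattr.h>

/* Hypothetical helper: render a limit as a decimal string.
 * Returns the string length, or -ENOSPC if the buffer is too small. */
static int quota_format(uint64_t limit, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%" PRIu64, limit);

	return (n < 0 || (size_t)n >= len) ? -ENOSPC : n;
}

/* Hypothetical helper: set a quota attribute on a directory; a limit
 * of 0 removes the quota.  Returns 0 on success or -errno on failure. */
static int quota_set(const char *dir, const char *attr, uint64_t limit)
{
	char val[32];
	int n = quota_format(limit, val, sizeof(val));

	if (n < 0)
		return n;
	if (setxattr(dir, attr, val, (size_t)n, 0) < 0)
		return -errno;
	return 0;
}
```

For example, quota_set("/mnt/cephfs/project", "ceph.quota.max_files", 10000)
would limit a (hypothetical) directory to 10000 files, and passing 0 would
remove that limit again.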

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/Makefile                   |  2 +-
 fs/ceph/inode.c                    |  6 ++++
 fs/ceph/mds_client.c               | 23 ++++++++++++++
 fs/ceph/mds_client.h               |  2 ++
 fs/ceph/quota.c                    | 65 ++++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h                    |  8 +++++
 fs/ceph/xattr.c                    | 44 ++++++++++++++++++++++++++
 include/linux/ceph/ceph_features.h |  3 +-
 include/linux/ceph/ceph_fs.h       | 17 ++++++++++
 9 files changed, 168 insertions(+), 2 deletions(-)
 create mode 100644 fs/ceph/quota.c

diff --git a/fs/ceph/Makefile b/fs/ceph/Makefile
index 174f5709e508..a699e320393f 100644
--- a/fs/ceph/Makefile
+++ b/fs/ceph/Makefile
@@ -6,7 +6,7 @@
 obj-$(CONFIG_CEPH_FS) += ceph.o
 
 ceph-y := super.o inode.o dir.o file.o locks.o addr.o ioctl.o \
-	export.o caps.o snap.o xattr.o \
+	export.o caps.o snap.o xattr.o quota.o \
 	mds_client.o mdsmap.o strings.o ceph_frag.o \
 	debugfs.o
 
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index ab81652198c4..8a0ba96e105d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -441,6 +441,9 @@ struct inode *ceph_alloc_inode(struct super_block *sb)
 	atomic64_set(&ci->i_complete_seq[1], 0);
 	ci->i_symlink = NULL;
 
+	ci->i_max_bytes = 0;
+	ci->i_max_files = 0;
+
 	memset(&ci->i_dir_layout, 0, sizeof(ci->i_dir_layout));
 	RCU_INIT_POINTER(ci->i_layout.pool_ns, NULL);
 
@@ -790,6 +793,9 @@ static int fill_inode(struct inode *inode, struct page *locked_page,
 	inode->i_rdev = le32_to_cpu(info->rdev);
 	inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
 
+	ci->i_max_bytes = iinfo->max_bytes;
+	ci->i_max_files = iinfo->max_files;
+
 	if ((new_version || (new_issued & CEPH_CAP_AUTH_SHARED)) &&
 	    (issued & CEPH_CAP_AUTH_EXCL) == 0) {
 		inode->i_mode = le32_to_cpu(info->mode);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 1b468250e947..2290056d13fc 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -100,6 +100,26 @@ static int parse_reply_info_in(void **p, void *end,
 	} else
 		info->inline_version = CEPH_INLINE_NONE;
 
+	if (features & CEPH_FEATURE_MDS_QUOTA) {
+		u8 struct_v, struct_compat;
+		u32 struct_len;
+
+		/*
+		 * both struct_v and struct_compat are expected to be >= 1
+		 */
+		ceph_decode_8_safe(p, end, struct_v, bad);
+		ceph_decode_8_safe(p, end, struct_compat, bad);
+		if (!struct_v || !struct_compat)
+			goto bad;
+		ceph_decode_32_safe(p, end, struct_len, bad);
+		ceph_decode_need(p, end, struct_len, bad);
+		ceph_decode_64_safe(p, end, info->max_bytes, bad);
+		ceph_decode_64_safe(p, end, info->max_files, bad);
+	} else {
+		info->max_bytes = 0;
+		info->max_files = 0;
+	}
+
 	info->pool_ns_len = 0;
 	info->pool_ns_data = NULL;
 	if (features & CEPH_FEATURE_FS_FILE_LAYOUT_V2) {
@@ -4064,6 +4084,9 @@ static void dispatch(struct ceph_connection *con, struct ceph_msg *msg)
 	case CEPH_MSG_CLIENT_LEASE:
 		handle_lease(mdsc, s, msg);
 		break;
+	case CEPH_MSG_CLIENT_QUOTA:
+		ceph_handle_quota(mdsc, s, msg);
+		break;
 
 	default:
 		pr_err("received unknown message type %d %s\n", type,
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 837ac4b087a0..7af576733948 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -49,6 +49,8 @@ struct ceph_mds_reply_info_in {
 	char *inline_data;
 	u32 pool_ns_len;
 	char *pool_ns_data;
+	u64 max_bytes;
+	u64 max_files;
 };
 
 struct ceph_mds_reply_dir_entry {
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
new file mode 100644
index 000000000000..1b69d8365ec2
--- /dev/null
+++ b/fs/ceph/quota.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * quota.c - CephFS quota
+ *
+ * Copyright (C) 2017-2018 SUSE
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "super.h"
+#include "mds_client.h"
+
+void ceph_handle_quota(struct ceph_mds_client *mdsc,
+		       struct ceph_mds_session *session,
+		       struct ceph_msg *msg)
+{
+	struct super_block *sb = mdsc->fsc->sb;
+	struct ceph_mds_quota *h = msg->front.iov_base;
+	struct ceph_vino vino;
+	struct inode *inode;
+	struct ceph_inode_info *ci;
+
+	if (msg->front.iov_len != sizeof(*h)) {
+		pr_err("%s corrupt message mds%d len %d\n", __func__,
+		       session->s_mds, (int)msg->front.iov_len);
+		ceph_msg_dump(msg);
+		return;
+	}
+
+	/* increment msg sequence number */
+	mutex_lock(&session->s_mutex);
+	session->s_seq++;
+	mutex_unlock(&session->s_mutex);
+
+	/* lookup inode */
+	vino.ino = le64_to_cpu(h->ino);
+	vino.snap = CEPH_NOSNAP;
+	inode = ceph_find_inode(sb, vino);
+	if (!inode) {
+		pr_warn("Failed to find inode %llu\n", vino.ino);
+		return;
+	}
+	ci = ceph_inode(inode);
+
+	spin_lock(&ci->i_ceph_lock);
+	ci->i_rbytes = le64_to_cpu(h->rbytes);
+	ci->i_rfiles = le64_to_cpu(h->rfiles);
+	ci->i_rsubdirs = le64_to_cpu(h->rsubdirs);
+	ci->i_max_bytes = le64_to_cpu(h->max_bytes);
+	ci->i_max_files = le64_to_cpu(h->max_files);
+	spin_unlock(&ci->i_ceph_lock);
+
+	iput(inode);
+}
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 2beeec07fa76..f998b7f076cf 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -309,6 +309,9 @@ struct ceph_inode_info {
 	u64 i_rbytes, i_rfiles, i_rsubdirs;
 	u64 i_files, i_subdirs;
 
+	/* quotas */
+	u64 i_max_bytes, i_max_files;
+
 	struct rb_root i_fragtree;
 	int i_fragtree_nsplits;
 	struct mutex i_fragtree_mutex;
@@ -1019,4 +1022,9 @@ extern int ceph_locks_to_pagelist(struct ceph_filelock *flocks,
 extern int ceph_fs_debugfs_init(struct ceph_fs_client *client);
 extern void ceph_fs_debugfs_cleanup(struct ceph_fs_client *client);
 
+/* quota.c */
+extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
+			      struct ceph_mds_session *session,
+			      struct ceph_msg *msg);
+
 #endif /* _FS_CEPH_SUPER_H */
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
index e1c4e0b12b4c..7e72348639e4 100644
--- a/fs/ceph/xattr.c
+++ b/fs/ceph/xattr.c
@@ -224,6 +224,31 @@ static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val,
 			(long)ci->i_rctime.tv_nsec);
 }
 
+/* quotas */
+
+static bool ceph_vxattrcb_quota_exists(struct ceph_inode_info *ci)
+{
+	return (ci->i_max_files || ci->i_max_bytes);
+}
+
+static size_t ceph_vxattrcb_quota(struct ceph_inode_info *ci, char *val,
+				  size_t size)
+{
+	return snprintf(val, size, "max_bytes=%llu max_files=%llu",
+			ci->i_max_bytes, ci->i_max_files);
+}
+
+static size_t ceph_vxattrcb_quota_max_bytes(struct ceph_inode_info *ci,
+					    char *val, size_t size)
+{
+	return snprintf(val, size, "%llu", ci->i_max_bytes);
+}
+
+static size_t ceph_vxattrcb_quota_max_files(struct ceph_inode_info *ci,
+					    char *val, size_t size)
+{
+	return snprintf(val, size, "%llu", ci->i_max_files);
+}
 
 #define CEPH_XATTR_NAME(_type, _name)	XATTR_CEPH_PREFIX #_type "." #_name
 #define CEPH_XATTR_NAME2(_type, _name, _name2)	\
@@ -247,6 +272,15 @@ static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val,
 		.hidden = true,			\
 		.exists_cb = ceph_vxattrcb_layout_exists,	\
 	}
+#define XATTR_QUOTA_FIELD(_type, _name)					\
+	{								\
+		.name = CEPH_XATTR_NAME(_type, _name),			\
+		.name_size = sizeof(CEPH_XATTR_NAME(_type, _name)),	\
+		.getxattr_cb = ceph_vxattrcb_ ## _type ## _ ## _name,	\
+		.readonly = false,					\
+		.hidden = true,						\
+		.exists_cb = ceph_vxattrcb_quota_exists,		\
+	}
 
 static struct ceph_vxattr ceph_dir_vxattrs[] = {
 	{
@@ -270,6 +304,16 @@ static struct ceph_vxattr ceph_dir_vxattrs[] = {
 	XATTR_NAME_CEPH(dir, rsubdirs),
 	XATTR_NAME_CEPH(dir, rbytes),
 	XATTR_NAME_CEPH(dir, rctime),
+	{
+		.name = "ceph.quota",
+		.name_size = sizeof("ceph.quota"),
+		.getxattr_cb = ceph_vxattrcb_quota,
+		.readonly = false,
+		.hidden = true,
+		.exists_cb = ceph_vxattrcb_quota_exists,
+	},
+	XATTR_QUOTA_FIELD(quota, max_bytes),
+	XATTR_QUOTA_FIELD(quota, max_files),
 	{ .name = NULL, 0 }	/* Required table terminator */
 };
 static size_t ceph_dir_vxattrs_name_size;	/* total size of all names */
diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
index 59042d5ac520..6acd46c36271 100644
--- a/include/linux/ceph/ceph_features.h
+++ b/include/linux/ceph/ceph_features.h
@@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facin
 	 CEPH_FEATURE_SERVER_JEWEL |		\
 	 CEPH_FEATURE_MON_STATEFUL_SUB |	\
 	 CEPH_FEATURE_CRUSH_TUNABLES5 |		\
-	 CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
+	 CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING |	\
+	 CEPH_FEATURE_MDS_QUOTA)
 
 #define CEPH_FEATURES_REQUIRED_DEFAULT   \
 	(CEPH_FEATURE_NOSRCADDR |	 \
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index 88dd51381aaf..98bdcc0eda3f 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -134,6 +134,7 @@ struct ceph_dir_layout {
 #define CEPH_MSG_CLIENT_LEASE           0x311
 #define CEPH_MSG_CLIENT_SNAP            0x312
 #define CEPH_MSG_CLIENT_CAPRELEASE      0x313
+#define CEPH_MSG_CLIENT_QUOTA		0x314
 
 /* pool ops */
 #define CEPH_MSG_POOLOP_REPLY           48
@@ -807,4 +808,20 @@ struct ceph_mds_snap_realm {
 } __attribute__ ((packed));
 /* followed by my snap list, then prior parent snap list */
 
+/*
+ * quotas
+ */
+struct ceph_mds_quota {
+	__le64 ino;		/* ino */
+	struct ceph_timespec rctime;
+	__le64 rbytes;		/* dir stats */
+	__le64 rfiles;
+	__le64 rsubdirs;
+	__u8 struct_v;		/* compat */
+	__u8 struct_compat;
+	__le32 struct_len;
+	__le64 max_bytes;	/* quota max. bytes */
+	__le64 max_files;	/* quota max. files */
+} __attribute__ ((packed));
+
 #endif

* [PATCH v4 2/6] ceph: quota: support for ceph.quota.max_files
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 1/6] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
@ 2018-01-05 10:47 ` Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 3/6] ceph: quota: don't allow cross-quota renames Luis Henriques
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

This patch adds support for the max_files quota.  It hooks into all the
ceph functions that add new filesystem objects that need to be checked
against the quota limits.  When these limits are hit, -EDQUOT is returned.
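
The check itself can be modelled in userspace as below.  This is only a
sketch of the logic: struct realm_node is a hypothetical stand-in for a
snaprealm plus the quota fields of its inode, and the locking and
reference counting done by the real code are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a snaprealm and its inode's quota state. */
struct realm_node {
	struct realm_node *parent;	/* NULL at the filesystem root */
	uint64_t max_files;		/* 0 means no quota is set */
	uint64_t rfiles;		/* recursive file count */
	uint64_t rsubdirs;		/* recursive subdirectory count */
};

/* Walk up towards the root; report true as soon as one realm with a
 * max_files quota has already reached its limit. */
static bool max_files_exceeded(const struct realm_node *r)
{
	for (; r; r = r->parent) {
		if (r->max_files && r->rfiles + r->rsubdirs >= r->max_files)
			return true;
	}
	return false;
}
```

A caller modelled on the hooks in this patch would return -EDQUOT whenever
max_files_exceeded() is true for the parent directory.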

Note that we're not checking quotas on ceph_link().  ceph_link() doesn't
really create a new inode, and since the MDS doesn't update the directory
statistics when a new hard link is created (it does so only for symlinks),
hard links are not accounted as new files.

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/dir.c   | 11 ++++++++
 fs/ceph/file.c  |  4 ++-
 fs/ceph/quota.c | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h |  1 +
 4 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 8a5266699b67..66550d92b1ac 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -818,6 +818,9 @@ static int ceph_mknod(struct inode *dir, struct dentry *dentry,
 	if (ceph_snap(dir) != CEPH_NOSNAP)
 		return -EROFS;
 
+	if (ceph_quota_is_max_files_exceeded(dir))
+		return -EDQUOT;
+
 	err = ceph_pre_init_acls(dir, &mode, &acls);
 	if (err < 0)
 		return err;
@@ -871,6 +874,9 @@ static int ceph_symlink(struct inode *dir, struct dentry *dentry,
 	if (ceph_snap(dir) != CEPH_NOSNAP)
 		return -EROFS;
 
+	if (ceph_quota_is_max_files_exceeded(dir))
+		return -EDQUOT;
+
 	dout("symlink in dir %p dentry %p to '%s'\n", dir, dentry, dest);
 	req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_SYMLINK, USE_AUTH_MDS);
 	if (IS_ERR(req)) {
@@ -920,6 +926,11 @@ static int ceph_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 		goto out;
 	}
 
+	if (ceph_quota_is_max_files_exceeded(dir)) {
+		err = -EDQUOT;
+		goto out;
+	}
+
 	mode |= S_IFDIR;
 	err = ceph_pre_init_acls(dir, &mode, &acls);
 	if (err < 0)
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 5c17125f45c7..5a77a66e3d6b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -371,7 +371,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 	struct ceph_mds_request *req;
 	struct dentry *dn;
 	struct ceph_acls_info acls = {};
-       int mask;
+	int mask;
 	int err;
 
 	dout("atomic_open %p dentry %p '%pd' %s flags %d mode 0%o\n",
@@ -382,6 +382,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 		return -ENAMETOOLONG;
 
 	if (flags & O_CREAT) {
+		if (ceph_quota_is_max_files_exceeded(dir))
+			return -EDQUOT;
 		err = ceph_pre_init_acls(dir, &mode, &acls);
 		if (err < 0)
 			return err;
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 1b69d8365ec2..cf1c78c4a4d2 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -63,3 +63,83 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
 
 	iput(inode);
 }
+
+enum quota_check_op {
+	QUOTA_CHECK_MAX_FILES_OP	/* check quota max_files limit */
+};
+
+/*
+ * check_quota_exceeded() will walk up the snaprealm hierarchy and, for each
+ * realm, it will execute quota check operation defined by the 'op' parameter.
+ * The snaprealm walk is interrupted if the quota check detects that the quota
+ * is exceeded or if the root inode is reached.
+ */
+static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
+				 loff_t delta)
+{
+	struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc;
+	struct ceph_inode_info *ci;
+	struct ceph_snap_realm *realm, *next;
+	struct ceph_vino vino;
+	struct inode *in;
+	u64 max, rvalue;
+	bool is_root;
+	bool exceeded = false;
+
+	down_read(&mdsc->snap_rwsem);
+	realm = ceph_inode(inode)->i_snap_realm;
+	ceph_get_snap_realm(mdsc, realm);
+	while (realm) {
+		vino.ino = realm->ino;
+		vino.snap = CEPH_NOSNAP;
+		in = ceph_find_inode(inode->i_sb, vino);
+		if (!in) {
+			pr_warn("Failed to find inode for %llu\n", vino.ino);
+			break;
+		}
+		ci = ceph_inode(in);
+		spin_lock(&ci->i_ceph_lock);
+		if (op == QUOTA_CHECK_MAX_FILES_OP) {
+			max = ci->i_max_files;
+			rvalue = ci->i_rfiles + ci->i_rsubdirs;
+		}
+		is_root = (ci->i_vino.ino == CEPH_INO_ROOT);
+		spin_unlock(&ci->i_ceph_lock);
+		switch (op) {
+		case QUOTA_CHECK_MAX_FILES_OP:
+			exceeded = (max && (rvalue >= max));
+			break;
+		default:
+			/* Shouldn't happen */
+			pr_warn("Invalid quota check op (%d)\n", op);
+			exceeded = true; /* Just break the loop */
+		}
+		iput(in);
+
+		if (is_root || exceeded)
+			break;
+		next = realm->parent;
+		ceph_get_snap_realm(mdsc, next);
+		ceph_put_snap_realm(mdsc, realm);
+		realm = next;
+	}
+	ceph_put_snap_realm(mdsc, realm);
+	up_read(&mdsc->snap_rwsem);
+
+	return exceeded;
+}
+
+/*
+ * ceph_quota_is_max_files_exceeded - check if we can create a new file
+ * @inode:	directory where a new file is being created
+ *
+ * This function returns true if the max_files quota has been reached, i.e. a
+ * new file cannot be created.  It is necessary to walk through the snaprealm
+ * hierarchy (until the FS root) to check all realms with quotas set.
+ */
+bool ceph_quota_is_max_files_exceeded(struct inode *inode)
+{
+	WARN_ON(!S_ISDIR(inode->i_mode));
+
+	return check_quota_exceeded(inode, QUOTA_CHECK_MAX_FILES_OP, 0);
+}
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index f998b7f076cf..20197e29a7f0 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1026,5 +1026,6 @@ extern void ceph_fs_debugfs_cleanup(struct ceph_fs_client *client);
 extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
 			      struct ceph_mds_session *session,
 			      struct ceph_msg *msg);
+extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
 
 #endif /* _FS_CEPH_SUPER_H */

* [PATCH v4 3/6] ceph: quota: don't allow cross-quota renames
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 1/6] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 2/6] ceph: quota: support for ceph.quota.max_files Luis Henriques
@ 2018-01-05 10:47 ` Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 4/6] ceph: quota: support for ceph.quota.max_bytes Luis Henriques
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

This patch changes ceph_rename() so that -EXDEV is returned if an attempt
is made to move a file between two directory trees that have different
quotas set up.
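
The idea can be sketched in userspace as follows.  struct quota_dir and
the helper names are hypothetical; the sketch only models "find the
nearest ancestor with a quota, falling back to the root" and compares the
results, without the locking and reference counting the kernel code needs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a directory's realm and its quota fields. */
struct quota_dir {
	struct quota_dir *parent;	/* NULL at the filesystem root */
	uint64_t max_files;		/* 0 means no quota is set */
	uint64_t max_bytes;		/* 0 means no quota is set */
};

/* Return the nearest realm that has a quota set, or the root realm. */
static const struct quota_dir *quota_realm(const struct quota_dir *d)
{
	while (d && d->parent && !d->max_files && !d->max_bytes)
		d = d->parent;
	return d;
}

static bool same_quota_realm(const struct quota_dir *old_dir,
			     const struct quota_dir *new_dir)
{
	return quota_realm(old_dir) == quota_realm(new_dir);
}

/* Mirror of the rename check: refuse moves that cross quota realms. */
static int check_rename(const struct quota_dir *old_dir,
			const struct quota_dir *new_dir)
{
	if (old_dir != new_dir && !same_quota_realm(old_dir, new_dir))
		return -EXDEV;
	return 0;
}
```

Returning -EXDEV makes tools such as mv fall back to copy-and-delete, so
the data ends up accounted against the destination quota.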

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/dir.c   |  5 +++++
 fs/ceph/quota.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h |  1 +
 3 files changed, 75 insertions(+)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 66550d92b1ac..f6ac16caa1e9 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1090,6 +1090,11 @@ static int ceph_rename(struct inode *old_dir, struct dentry *old_dentry,
 		else
 			return -EROFS;
 	}
+	/* don't allow cross-quota renames */
+	if ((old_dir != new_dir) &&
+	    (!ceph_quota_is_same_realm(old_dir, new_dir)))
+		return -EXDEV;
+
 	dout("rename dir %p dentry %p to dir %p dentry %p\n",
 	     old_dir, old_dentry, new_dir, new_dentry);
 	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index cf1c78c4a4d2..5d7dada91a57 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -21,6 +21,11 @@
 #include "super.h"
 #include "mds_client.h"
 
+static inline bool ceph_has_quota(struct ceph_inode_info *ci)
+{
+	return (ci && (ci->i_max_files || ci->i_max_bytes));
+}
+
 void ceph_handle_quota(struct ceph_mds_client *mdsc,
 		       struct ceph_mds_session *session,
 		       struct ceph_msg *msg)
@@ -64,6 +69,70 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
 	iput(inode);
 }
 
+/*
+ * This function walks through the snaprealm for an inode and returns the
+ * ceph_snap_realm for the first snaprealm that has quotas set (either max_files
+ * or max_bytes).  If the root is reached, return the root ceph_snap_realm
+ * instead.
+ *
+ * Note that the caller is responsible for calling ceph_put_snap_realm() on the
+ * returned realm.
+ */
+static struct ceph_snap_realm *get_quota_realm(struct ceph_mds_client *mdsc,
+					       struct inode *inode)
+{
+	struct ceph_inode_info *ci = NULL;
+	struct ceph_snap_realm *realm, *next;
+	struct ceph_vino vino;
+	struct inode *in;
+
+	realm = ceph_inode(inode)->i_snap_realm;
+	ceph_get_snap_realm(mdsc, realm);
+	while (realm) {
+		vino.ino = realm->ino;
+		vino.snap = CEPH_NOSNAP;
+		in = ceph_find_inode(inode->i_sb, vino);
+		if (!in) {
+			pr_warn("Failed to find inode for %llu\n", vino.ino);
+			break;
+		}
+		ci = ceph_inode(in);
+		if (ceph_has_quota(ci) || (ci->i_vino.ino == CEPH_INO_ROOT)) {
+			iput(in);
+			return realm;
+		}
+		iput(in);
+		next = realm->parent;
+		ceph_get_snap_realm(mdsc, next);
+		ceph_put_snap_realm(mdsc, realm);
+		realm = next;
+	}
+	if (realm)
+		ceph_put_snap_realm(mdsc, realm);
+
+	return NULL;
+}
+
+bool ceph_quota_is_same_realm(struct inode *old, struct inode *new)
+{
+	struct ceph_mds_client *mdsc = ceph_inode_to_client(old)->mdsc;
+	struct ceph_snap_realm *old_realm, *new_realm;
+	bool is_same;
+
+	down_read(&mdsc->snap_rwsem);
+	old_realm = get_quota_realm(mdsc, old);
+	new_realm = get_quota_realm(mdsc, new);
+	is_same = (old_realm == new_realm);
+	up_read(&mdsc->snap_rwsem);
+
+	if (old_realm)
+		ceph_put_snap_realm(mdsc, old_realm);
+	if (new_realm)
+		ceph_put_snap_realm(mdsc, new_realm);
+
+	return is_same;
+}
+
 enum quota_check_op {
 	QUOTA_CHECK_MAX_FILES_OP	/* check quota max_files limit */
 };
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 20197e29a7f0..a66e73338386 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1027,5 +1027,6 @@ extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
 			      struct ceph_mds_session *session,
 			      struct ceph_msg *msg);
 extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
+extern bool ceph_quota_is_same_realm(struct inode *old, struct inode *new);
 
 #endif /* _FS_CEPH_SUPER_H */

* [PATCH v4 4/6] ceph: quota: support for ceph.quota.max_bytes
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
                   ` (2 preceding siblings ...)
  2018-01-05 10:47 ` [PATCH v4 3/6] ceph: quota: don't allow cross-quota renames Luis Henriques
@ 2018-01-05 10:47 ` Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 5/6] ceph: quota: update MDS when max_bytes is approaching Luis Henriques
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/file.c  | 11 +++++++++++
 fs/ceph/inode.c |  4 ++++
 fs/ceph/quota.c | 28 +++++++++++++++++++++++++++-
 fs/ceph/super.h |  2 ++
 4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 5a77a66e3d6b..762402d323a6 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1331,6 +1331,11 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 
 	pos = iocb->ki_pos;
 	count = iov_iter_count(from);
+	if (ceph_quota_is_max_bytes_exceeded(inode, pos + count)) {
+		err = -EDQUOT;
+		goto out;
+	}
+
 	err = file_remove_privs(file);
 	if (err)
 		goto out;
@@ -1661,6 +1666,12 @@ static long ceph_fallocate(struct file *file, int mode,
 		goto unlock;
 	}
 
+	if (!(mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)) &&
+	    ceph_quota_is_max_bytes_exceeded(inode, offset + length)) {
+		ret = -EDQUOT;
+		goto unlock;
+	}
+
 	if (ceph_osdmap_flag(osdc, CEPH_OSDMAP_FULL) &&
 	    !(mode & FALLOC_FL_PUNCH_HOLE)) {
 		ret = -ENOSPC;
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 8a0ba96e105d..342ba26f6232 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2130,6 +2130,10 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr)
 	if (err != 0)
 		return err;
 
+	if ((attr->ia_valid & ATTR_SIZE) &&
+	    ceph_quota_is_max_bytes_exceeded(inode, attr->ia_size))
+		return -EDQUOT;
+
 	err = __ceph_setattr(inode, attr);
 
 	if (err >= 0 && (attr->ia_valid & ATTR_MODE))
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 5d7dada91a57..745f9f47027b 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -134,7 +134,8 @@ bool ceph_quota_is_same_realm(struct inode *old, struct inode *new)
 }
 
 enum quota_check_op {
-	QUOTA_CHECK_MAX_FILES_OP	/* check quota max_files limit */
+	QUOTA_CHECK_MAX_FILES_OP,	/* check quota max_files limit */
+	QUOTA_CHECK_MAX_BYTES_OP	/* check quota max_files limit */
 };
 
 /*
@@ -171,6 +172,9 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
 		if (op == QUOTA_CHECK_MAX_FILES_OP) {
 			max = ci->i_max_files;
 			rvalue = ci->i_rfiles + ci->i_rsubdirs;
+		} else {
+			max = ci->i_max_bytes;
+			rvalue = ci->i_rbytes;
 		}
 		is_root = (ci->i_vino.ino == CEPH_INO_ROOT);
 		spin_unlock(&ci->i_ceph_lock);
@@ -178,6 +182,9 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
 		case QUOTA_CHECK_MAX_FILES_OP:
 			exceeded = (max && (rvalue >= max));
 			break;
+		case QUOTA_CHECK_MAX_BYTES_OP:
+			exceeded = (max && (rvalue + delta > max));
+			break;
 		default:
 			/* Shouldn't happen */
 			pr_warn("Invalid quota check op (%d)\n", op);
@@ -212,3 +219,22 @@ bool ceph_quota_is_max_files_exceeded(struct inode *inode)
 
 	return check_quota_exceeded(inode, QUOTA_CHECK_MAX_FILES_OP, 0);
 }
+
+/*
+ * ceph_quota_is_max_bytes_exceeded - check if we can write to a file
+ * @inode:	inode being written
+ * @newsize:	new size if write succeeds
+ *
+ * This function returns true if the max_bytes quota does not allow the file
+ * size to reach @newsize; it returns false otherwise.
+ */
+bool ceph_quota_is_max_bytes_exceeded(struct inode *inode, loff_t newsize)
+{
+	loff_t size = i_size_read(inode);
+
+	/* return immediately if we're decreasing file size */
+	if (newsize <= size)
+		return false;
+
+	return check_quota_exceeded(inode, QUOTA_CHECK_MAX_BYTES_OP, (newsize - size));
+}
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index a66e73338386..60ace9ce5a94 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1028,5 +1028,7 @@ extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
 			      struct ceph_msg *msg);
 extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
 extern bool ceph_quota_is_same_realm(struct inode *old, struct inode *new);
+extern bool ceph_quota_is_max_bytes_exceeded(struct inode *inode,
+					     loff_t newlen);
 
 #endif /* _FS_CEPH_SUPER_H */

* [PATCH v4 5/6] ceph: quota: update MDS when max_bytes is approaching
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
                   ` (3 preceding siblings ...)
  2018-01-05 10:47 ` [PATCH v4 4/6] ceph: quota: support for ceph.quota.max_bytes Luis Henriques
@ 2018-01-05 10:47 ` Luis Henriques
  2018-01-05 10:47 ` [PATCH v4 6/6] ceph: quota: add quotas to the in-tree cephfs documentation Luis Henriques
  2018-01-08 12:55 ` [PATCH v4 0/6] ceph: kernel client cephfs quota support Yan, Zheng
  6 siblings, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

When we're approaching the ceph.quota.max_bytes limit, i.e., when a write
consumes more than 1/16th of the space left in a quota realm, update the
MDS with the new file size.
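
A minimal userspace sketch of such a threshold test follows.  It is an
assumption based on the description above rather than a copy of the
kernel code: max_bytes_approaching() is a hypothetical name, and treating
a realm that is already at or over its limit as "approaching" is a choice
made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if adding 'delta' bytes uses more than 1/16th of the
 * space still available under the max_bytes quota. */
static bool max_bytes_approaching(uint64_t max_bytes, uint64_t rbytes,
				  uint64_t delta)
{
	if (!max_bytes)
		return false;	/* no quota set */
	if (rbytes >= max_bytes)
		return true;	/* already at or over the limit */
	return delta > (max_bytes - rbytes) / 16;
}
```

With a 1600-byte quota and nothing used yet, a 100-byte write stays below
the threshold while a 101-byte write crosses it.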

This mirrors the fuse-client approach with commit 122c50315ed1 ("client:
Inform mds file size when approaching quota limit"), in the ceph git tree.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/file.c  |  6 ++++++
 fs/ceph/quota.c | 38 +++++++++++++++++++++++++++++++++++++-
 fs/ceph/super.h |  2 ++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 762402d323a6..4a8fe87edfa3 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1417,6 +1417,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 
 	if (written >= 0) {
 		int dirty;
+
 		spin_lock(&ci->i_ceph_lock);
 		ci->i_inline_version = CEPH_INLINE_NONE;
 		dirty = __ceph_mark_dirty_caps(ci, CEPH_CAP_FILE_WR,
@@ -1424,6 +1425,8 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		spin_unlock(&ci->i_ceph_lock);
 		if (dirty)
 			__mark_inode_dirty(inode, dirty);
+		if (ceph_quota_is_max_bytes_approaching(inode, iocb->ki_pos))
+			ceph_check_caps(ci, CHECK_CAPS_NODELAY, NULL);
 	}
 
 	dout("aio_write %p %llx.%llx %llu~%u  dropping cap refs on %s\n",
@@ -1720,6 +1723,9 @@ static long ceph_fallocate(struct file *file, int mode,
 		spin_unlock(&ci->i_ceph_lock);
 		if (dirty)
 			__mark_inode_dirty(inode, dirty);
+		if ((endoff > size) &&
+		    ceph_quota_is_max_bytes_approaching(inode, endoff))
+			ceph_check_caps(ci, CHECK_CAPS_NODELAY, NULL);
 	}
 
 	ceph_put_cap_refs(ci, got);
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 745f9f47027b..caa79fda4c5d 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -135,7 +135,9 @@ bool ceph_quota_is_same_realm(struct inode *old, struct inode *new)
 
 enum quota_check_op {
 	QUOTA_CHECK_MAX_FILES_OP,	/* check quota max_files limit */
-	QUOTA_CHECK_MAX_BYTES_OP	/* check quota max_files limit */
+	QUOTA_CHECK_MAX_BYTES_OP,	/* check quota max_bytes limit */
+	QUOTA_CHECK_MAX_BYTES_APPROACHING_OP	/* check if quota max_bytes
+						   limit is approaching */
 };
 
 /*
@@ -185,6 +187,20 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
 		case QUOTA_CHECK_MAX_BYTES_OP:
 			exceeded = (max && (rvalue + delta > max));
 			break;
+		case QUOTA_CHECK_MAX_BYTES_APPROACHING_OP:
+			if (max) {
+				if (rvalue >= max)
+					exceeded = true;
+				else {
+					/*
+					 * when we're writing more than 1/16th
+					 * of the available space
+					 */
+					exceeded =
+						(((max - rvalue) >> 4) < delta);
+				}
+			}
+			break;
 		default:
 			/* Shouldn't happen */
 			pr_warn("Invalid quota check op (%d)\n", op);
@@ -238,3 +254,23 @@ bool ceph_quota_is_max_bytes_exceeded(struct inode *inode, loff_t newsize)
 
 	return check_quota_exceeded(inode, QUOTA_CHECK_MAX_BYTES_OP, (newsize - size));
 }
+
+/*
+ * ceph_quota_is_max_bytes_approaching - check if we're reaching max_bytes
+ * @inode:	inode being written
+ * @newsize:	new size if write succeeds
+ *
+ * This function returns true if the new file size @newsize will be consuming
+ * more than 1/16th of the available quota space; it returns false otherwise.
+ */
+bool ceph_quota_is_max_bytes_approaching(struct inode *inode, loff_t newsize)
+{
+	loff_t size = ceph_inode(inode)->i_reported_size;
+
+	/* return immediately if we're decreasing file size */
+	if (newsize <= size)
+		return false;
+
+	return check_quota_exceeded(inode, QUOTA_CHECK_MAX_BYTES_APPROACHING_OP,
+				    (newsize - size));
+}
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 60ace9ce5a94..13623dac0e1b 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1030,5 +1030,7 @@ extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
 extern bool ceph_quota_is_same_realm(struct inode *old, struct inode *new);
 extern bool ceph_quota_is_max_bytes_exceeded(struct inode *inode,
 					     loff_t newlen);
+extern bool ceph_quota_is_max_bytes_approaching(struct inode *inode,
+						loff_t newlen);
 
 #endif /* _FS_CEPH_SUPER_H */
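
The "approaching" threshold in this patch can be illustrated with a small userspace sketch. This is an illustration under assumptions, not the kernel implementation: the function name and the plain integer parameters are hypothetical, and delta stands for the growth since the size last reported to the MDS (i_reported_size in the patch).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the QUOTA_CHECK_MAX_BYTES_APPROACHING_OP case.
 * max:    quota limit in bytes (0 means no quota is set)
 * rvalue: bytes currently accounted to the quota realm
 * delta:  bytes written beyond the last size reported to the MDS
 */
static bool max_bytes_approaching_sketch(uint64_t max, uint64_t rvalue,
					 uint64_t delta)
{
	if (!max)
		return false;

	/* already at or over the limit: always report */
	if (rvalue >= max)
		return true;

	/* report when delta is more than 1/16th of the space left */
	return ((max - rvalue) >> 4) < delta;
}
```

With a 1600-byte limit and nothing used yet, 1/16th of the free space is 100 bytes, so a 100-byte write does not trigger an update but a 101-byte write does.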

* [PATCH v4 6/6] ceph: quota: add quotas to the in-tree cephfs documentation
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
                   ` (4 preceding siblings ...)
  2018-01-05 10:47 ` [PATCH v4 5/6] ceph: quota: update MDS when max_bytes is approaching Luis Henriques
@ 2018-01-05 10:47 ` Luis Henriques
  2018-01-08 12:55 ` [PATCH v4 0/6] ceph: kernel client cephfs quota support Yan, Zheng
  6 siblings, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-05 10:47 UTC (permalink / raw)
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

Update documentation to include a high-level description of ceph quotas.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 Documentation/filesystems/ceph.txt | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
index 0b302a11718a..094772481263 100644
--- a/Documentation/filesystems/ceph.txt
+++ b/Documentation/filesystems/ceph.txt
@@ -62,6 +62,18 @@ subdirectories, and a summation of all nested file sizes.  This makes
 the identification of large disk space consumers relatively quick, as
 no 'du' or similar recursive scan of the file system is required.
 
+Finally, Ceph also allows quotas to be set on any directory in the system.
+The quota can restrict the number of bytes or the number of files stored
+beneath that point in the directory hierarchy.  Quotas can be set using
+extended attributes 'ceph.quota.max_files' and 'ceph.quota.max_bytes', e.g.:
+
+ setfattr -n ceph.quota.max_bytes -v 100000000 /some/dir
+ getfattr -n ceph.quota.max_bytes /some/dir
+
+A limitation of the current quotas implementation is that it relies on the
+cooperation of the client mounting the file system to stop writers when a
+limit is reached.  A modified or adversarial client cannot be prevented
+from writing as much data as it needs.
 
 Mount Syntax
 ============
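
The setfattr/getfattr commands in the documentation hunk above have a programmatic equivalent. The following is a sketch assuming a Linux system with setxattr(2) and getxattr(2); the helper names are hypothetical, and the value is passed as a decimal string on the assumption that this matches what setfattr -v sends.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper: set ceph.quota.max_bytes on @dir.
 * Returns 0 on success, -1 on error (errno set by setxattr). */
static int quota_set_max_bytes(const char *dir, uint64_t max_bytes)
{
	char value[32];
	int len = snprintf(value, sizeof(value), "%" PRIu64, max_bytes);

	if (len < 0 || (size_t)len >= sizeof(value))
		return -1;

	/* value is a decimal string, like "setfattr -v 100000000" */
	return setxattr(dir, "ceph.quota.max_bytes", value, (size_t)len, 0);
}

/* Hypothetical helper: read the attribute back as a NUL-terminated string.
 * Returns 0 on success, -1 on error. */
static int quota_get_max_bytes(const char *dir, char *buf, size_t buflen)
{
	ssize_t n;

	if (buflen == 0)
		return -1;

	n = getxattr(dir, "ceph.quota.max_bytes", buf, buflen - 1);
	if (n < 0)
		return -1;

	buf[n] = '\0';
	return 0;
}
```

On a directory that is not on a CephFS mount these calls are expected to fail; the error handling is kept minimal on purpose.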

* Re: [PATCH v4 0/6] ceph: kernel client cephfs quota support
  2018-01-05 10:47 [PATCH v4 0/6] ceph: kernel client cephfs quota support Luis Henriques
                   ` (5 preceding siblings ...)
  2018-01-05 10:47 ` [PATCH v4 6/6] ceph: quota: add quotas to the in-tree cephfs documentation Luis Henriques
@ 2018-01-08 12:55 ` Yan, Zheng
  2018-01-08 15:00   ` Luis Henriques
  2018-01-09 17:24   ` Luis Henriques
  6 siblings, 2 replies; 11+ messages in thread
From: Yan, Zheng @ 2018-01-08 12:55 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

On Fri, Jan 5, 2018 at 6:47 PM, Luis Henriques <lhenriques@suse.com> wrote:
> A cephfs-specific quota implementation has been available in the
> user-space fuse client for a while.  This quota implementation allows an
> administrator to restrict the number of bytes and/or the number of files
> in a filesystem subtree.  This quota implementation, however, is
> supported at the client-level only, which means that cooperation is
> required between different clients accessing the system.
>
> This obviously assumes that all clients are trusted entities and will
> respect the quotas, preventing users from exceeding the quota limits.
> Since the kernel client doesn't support quotas, it has not been possible
> to use it in a cluster where quotas are a requirement.
>
> This patchset adds kernel client support for cephfs quotas as it is
> currently implemented in the ceph fuse client.  Note however that it
> relies on some still-to-be-merged changes to the MDS (see below,
> "Changes since v1" for details).
>
> For further details on CephFS quota, see [1].
>
> [1] http://docs.ceph.com/docs/master/cephfs/quota/
>
> ** Changes since v3 **
>
> - Rework after review from Yan, Zheng:
>   * ceph_handle_quota(): Always increment message sequence number, even
>     if inode isn't in cache
>   * renamed inode variables ino -> in
>   * get_quota_realm() now returns a ceph_snap_realm instead of
>     ceph_inode_info
>
> - Updated quota.c copyright and added SPDX identifier
>
> - Added max_bytes quota implementation
>
> - Updated Documentation/filesystems/ceph.txt to include reference to
>   quota; also added a few more comments to the code.
>
> ** Changes since v2 **
>
> Rework after review from Yan, Zheng:
>
> - Dropped patch 0001 ("ceph: add seqlock for snaprealm hierarchy change
>   detection") and use mdsc->snap_rwsem for walking the snaprealm
>   hierarchy instead of adding a seqlock.  This means that patches 0003
>   and 0004 needed to be reworked.
>
> - Added a NULL check in ceph_handle_quota() after the inode lookup with
>   ceph_find_inode().
>
> ** Changes since v1 **
>
> Instead of trying to do a reverse path walk to find the "quota realm"
> for a given directory, this patchset is now using snaprealms.  Thus, for
> testing it, a modified MDS is required:
>
>   https://github.com/ukernel/ceph/tree/wip-cephfs-quota-realm
>
> This modified MDS creates a snaprealm when a quota is set in a
> directory.  This means that a client needs only to walk up the snaprealm
> hierarchy to find a directory that has quotas instead of doing the full
> reverse path walking.
>
> Note however that this requires an extra patch that adds a seqlock (1st
> patch in series) to detect changes in the snaprealm hierarchy.
>
> Luis Henriques (6):
>   ceph: quota: add initial infrastructure to support cephfs quotas
>   ceph: quota: support for ceph.quota.max_files
>   ceph: quota: don't allow cross-quota renames
>   ceph: quota: support for ceph.quota.max_bytes
>   ceph: quota: update MDS when max_bytes is approaching
>   ceph: quota: add quotas to the in-tree cephfs documentation
>
>  Documentation/filesystems/ceph.txt |  12 ++
>  fs/ceph/Makefile                   |   2 +-
>  fs/ceph/dir.c                      |  16 +++
>  fs/ceph/file.c                     |  21 ++-
>  fs/ceph/inode.c                    |  10 ++
>  fs/ceph/mds_client.c               |  23 ++++
>  fs/ceph/mds_client.h               |   2 +
>  fs/ceph/quota.c                    | 276 +++++++++++++++++++++++++++++++++++++
>  fs/ceph/super.h                    |  14 ++
>  fs/ceph/xattr.c                    |  44 ++++++
>  include/linux/ceph/ceph_features.h |   3 +-
>  include/linux/ceph/ceph_fs.h       |  17 +++
>  12 files changed, 437 insertions(+), 3 deletions(-)
>  create mode 100644 fs/ceph/quota.c
>
Hi Luis.

This series looks good. I will add them to our testing branch if my
local test goes well.

I also wish to do some optimization (incremental patches) to the code

1. avoid quota check if there is no quota in the filesystem. we can
use a variable to track how many snaprealms have quota enabled.
2. save inode pointer (without increasing the inode's reference count) in
the ceph_snap_realm data structure, so we can avoid calling
ceph_find_inode() in check_quota_exceeded()

Regards
Yan, Zheng


* Re: [PATCH v4 0/6] ceph: kernel client cephfs quota support
  2018-01-08 12:55 ` [PATCH v4 0/6] ceph: kernel client cephfs quota support Yan, Zheng
@ 2018-01-08 15:00   ` Luis Henriques
  2018-01-09 17:24   ` Luis Henriques
  1 sibling, 0 replies; 11+ messages in thread
From: Luis Henriques @ 2018-01-08 15:00 UTC (permalink / raw)
  To: Yan, Zheng; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

"Yan, Zheng" <ukernel@gmail.com> writes:

> On Fri, Jan 5, 2018 at 6:47 PM, Luis Henriques <lhenriques@suse.com> wrote:
<snip>
> Hi Luis.
>
> This series looks good. I will add them to our testing branch if my
> local test goes well.

Awesome, thanks a lot for your review.  Please let me know if you find
any issues during testing.

>
> I also wish to do some optimization (incremental patches) to the code
>
> 1. avoid quota check if there is no quota in the filesystem. we can
> use a variable to track how many snaprealms have quota enabled.
> 2. save inode pointer  (not increase inode's reference count) in
> ceph_snap_realm data structure. so we can avoid calling
> ceph_find_inode() in check_quota_exceeded()

Yeah, these changes seem to make sense.  I'll have a look at them and
see what I can come up with.

Cheers,
-- 
Luis

* Re: [PATCH v4 0/6] ceph: kernel client cephfs quota support
  2018-01-08 12:55 ` [PATCH v4 0/6] ceph: kernel client cephfs quota support Yan, Zheng
  2018-01-08 15:00   ` Luis Henriques
@ 2018-01-09 17:24   ` Luis Henriques
  2018-01-10  6:08     ` Yan, Zheng
  1 sibling, 1 reply; 11+ messages in thread
From: Luis Henriques @ 2018-01-09 17:24 UTC (permalink / raw)
  To: Yan, Zheng; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

"Yan, Zheng" <ukernel@gmail.com> writes:

> I also wish to do some optimization (incremental patches) to the code
>
> 1. avoid quota check if there is no quota in the filesystem. we can
> use a variable to track how many snaprealms have quota enabled.

I was thinking this should probably be an atomic64_t in struct
ceph_mds_client.  Does this make sense?

Also, it would probably make sense to have the following functions as
inline, defined in super.h:

 ceph_quota_is_max_files_exceeded
 ceph_quota_is_max_bytes_exceeded
 ceph_quota_is_max_bytes_approaching

This way they could simply check the mdsc and return immediately if the
counter is 0.
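
To make the idea concrete, here is a rough userspace sketch of such a fast path. Everything in it is hypothetical: the struct, the field names and the slow-path stub are invented for illustration, and C11 atomic_int_fast64_t stands in for the kernel's atomic64_t.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct ceph_mds_client. */
struct mdsc_sketch {
	atomic_int_fast64_t quota_realms;	/* realms with a quota set */
	uint64_t slow_path_calls;		/* instrumentation for this sketch only */
};

/* Stub for the real check (the snaprealm walk is not modelled here). */
static bool check_quota_slow(struct mdsc_sketch *mdsc)
{
	mdsc->slow_path_calls++;
	return false;
}

/* Inline wrapper: skip the expensive check when no quota exists at all. */
static inline bool quota_is_max_files_exceeded_sketch(struct mdsc_sketch *mdsc)
{
	if (atomic_load(&mdsc->quota_realms) == 0)
		return false;

	return check_quota_slow(mdsc);
}
```

The counter would be incremented when a realm gains a quota and decremented when it loses it; that bookkeeping is left out of the sketch.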

> 2. save inode pointer  (not increase inode's reference count) in
> ceph_snap_realm data structure. so we can avoid calling
> ceph_find_inode() in check_quota_exceeded()

Would you like me to work on v5 to include these changes, or would you
rather keep v4 and simply have 2 extra patches on top implementing these
optimisations?

Cheers,
-- 
Luis

* Re: [PATCH v4 0/6] ceph: kernel client cephfs quota support
  2018-01-09 17:24   ` Luis Henriques
@ 2018-01-10  6:08     ` Yan, Zheng
  0 siblings, 0 replies; 11+ messages in thread
From: Yan, Zheng @ 2018-01-10  6:08 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

On Wed, Jan 10, 2018 at 1:24 AM, Luis Henriques <lhenriques@suse.com> wrote:
> "Yan, Zheng" <ukernel@gmail.com> writes:
>
>> I also wish to do some optimization (incremental patches) to the code
>>
>> 1. avoid quota check if there is no quota in the filesystem. we can
>> use a variable to track how many snaprealms have quota enabled.
>
> I was thinking this should probably be an atomic64_t in struct
> ceph_mds_client.  Does this make sense?
>
yes


> Also, it would probably make sense to have the following functions as
> inline, defined in super.h:
>
>  ceph_quota_is_max_files_exceeded
>  ceph_quota_is_max_bytes_exceeded
>  ceph_quota_is_max_bytes_approaching
>
> This way they could simply check the mdsc and return immediately if the
> counter is 0.
>
>> 2. save inode pointer  (not increase inode's reference count) in
>> ceph_snap_realm data structure. so we can avoid calling
>> ceph_find_inode() in check_quota_exceeded()
>
> Would you like me to work on v5 to include these changes, or would you
> rather keep v4 and simply have 2 extra patches on top implementing these
> optimisations?

I'd like to keep v4 and have extra patches

Regards
Yan, Zheng

>
> Cheers,
> --
> Luis
