* [PATCH 1/5] Add unlocked version of inode_add_bytes() function
@ 2009-12-14 12:21 Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov
2009-12-14 17:21 ` [PATCH 1/5] Add unlocked version of inode_add_bytes() function Jan Kara
0 siblings, 2 replies; 14+ messages in thread
From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov
Quota code requires unlocked version of this function. Off course
we can just copy-paste the code, but copy-pasting is always an evil.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/stat.c | 10 ++++++++--
include/linux/fs.h | 1 +
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/stat.c b/fs/stat.c
index 075694e..c4ecd52 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -401,9 +401,9 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, char __user *, filename,
}
#endif /* __ARCH_WANT_STAT64 */
-void inode_add_bytes(struct inode *inode, loff_t bytes)
+/* Caller is here responsible for sufficient locking (ie. inode->i_lock) */
+void __inode_add_bytes(struct inode *inode, loff_t bytes)
{
- spin_lock(&inode->i_lock);
inode->i_blocks += bytes >> 9;
bytes &= 511;
inode->i_bytes += bytes;
@@ -411,6 +411,12 @@ void inode_add_bytes(struct inode *inode, loff_t bytes)
inode->i_blocks++;
inode->i_bytes -= 512;
}
+}
+
+void inode_add_bytes(struct inode *inode, loff_t bytes)
+{
+ spin_lock(&inode->i_lock);
+ __inode_add_bytes(inode, bytes);
spin_unlock(&inode->i_lock);
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2620a8c..98ea200 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2314,6 +2314,7 @@ extern const struct inode_operations page_symlink_inode_operations;
extern int generic_readlink(struct dentry *, char __user *, int);
extern void generic_fillattr(struct inode *, struct kstat *);
extern int vfs_getattr(struct vfsmount *, struct dentry *, struct kstat *);
+void __inode_add_bytes(struct inode *inode, loff_t bytes);
void inode_add_bytes(struct inode *inode, loff_t bytes);
void inode_sub_bytes(struct inode *inode, loff_t bytes);
loff_t inode_get_bytes(struct inode *inode);
--
1.6.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5]
2009-12-14 12:21 [PATCH 1/5] Add unlocked version of inode_add_bytes() function Dmitry Monakhov
@ 2009-12-14 12:21 ` Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov
2009-12-14 17:24 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Jan Kara
2009-12-14 17:21 ` [PATCH 1/5] Add unlocked version of inode_add_bytes() function Jan Kara
1 sibling, 2 replies; 14+ messages in thread
From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov, linux-ext4
Currently inode_reservation is managed by fs itself and this
reservation is transfered on dquot_transfer(). This means what
inode_reservation must always be in sync with
dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result
in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be
triggered)
This is not easy because of complex locking order issues
for example http://bugzilla.kernel.org/show_bug.cgi?id=14739
The patch introduce quota reservation field for each fs-inode
(fs specific inode is used in order to prevent bloating generic
vfs inode). This reservation is managed by quota code internally
similar to i_blocks/i_bytes and may not be always in sync with
internal fs reservation.
Also perform some code rearrangement:
- Unify dquot_reserve_space() and dquot_reserve_space()
- Unify dquot_release_reserved_space() and dquot_free_space()
- Also this patch add missing warning update to release_rsv()
dquot_release_reserved_space() must call flush_warnings() as
dquot_free_space() does.
Changes from V4
- fixes and cleanups according to Jan's comments.
Changes from V3:
- fix deadlock in dquota_alloc_space in journalled mode with nodelalloc
Changes from V1:
- move qutoa_reservation field from vfs_inode to fs_inode
- account reservation in bytes instead of blocks.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/quota/dquot.c | 213 +++++++++++++++++++++++++++----------------------
include/linux/quota.h | 5 +-
2 files changed, 122 insertions(+), 96 deletions(-)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 39b49c4..942ea0b 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1388,6 +1388,67 @@ void vfs_dq_drop(struct inode *inode)
EXPORT_SYMBOL(vfs_dq_drop);
/*
+ * inode_reserved_space is managed internally by quota, and protected by
+ * i_lock similar to i_blocks+i_bytes.
+ */
+static qsize_t *inode_reserved_space(struct inode * inode)
+{
+ /* Filesystem must explicitly define it's own method in order to use
+ * quota reservation interface */
+ BUG_ON(!inode->i_sb->dq_op->get_reserved_space);
+ return inode->i_sb->dq_op->get_reserved_space(inode);
+}
+
+static void inode_add_rsv_space(struct inode *inode, qsize_t number)
+{
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) += number;
+ spin_unlock(&inode->i_lock);
+}
+
+
+static void inode_claim_rsv_space(struct inode *inode, qsize_t number)
+{
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) -= number;
+ __inode_add_bytes(inode, number);
+ spin_unlock(&inode->i_lock);
+}
+
+static void inode_sub_rsv_space(struct inode *inode, qsize_t number)
+{
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) -= number;
+ spin_unlock(&inode->i_lock);
+}
+
+static qsize_t inode_get_rsv_space(struct inode *inode)
+{
+ qsize_t ret;
+ spin_lock(&inode->i_lock);
+ ret = *inode_reserved_space(inode);
+ spin_unlock(&inode->i_lock);
+ return ret;
+}
+
+static void inode_incr_space(struct inode *inode, qsize_t number,
+ int reserve)
+{
+ if (reserve)
+ inode_add_rsv_space(inode, number);
+ else
+ inode_add_bytes(inode, number);
+}
+
+static void inode_decr_space(struct inode *inode, qsize_t number, int reserve)
+{
+ if (reserve)
+ inode_sub_rsv_space(inode, number);
+ else
+ inode_sub_bytes(inode, number);
+}
+
+/*
* Following four functions update i_blocks+i_bytes fields and
* quota information (together with appropriate checks)
* NOTE: We absolutely rely on the fact that caller dirties
@@ -1405,6 +1466,21 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number,
int cnt, ret = QUOTA_OK;
char warntype[MAXQUOTAS];
+ /*
+ * First test before acquiring mutex - solves deadlocks when we
+ * re-enter the quota code and are already holding the mutex
+ */
+ if (IS_NOQUOTA(inode)) {
+ inode_incr_space(inode, number, reserve);
+ goto out;
+ }
+
+ down_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
+ if (IS_NOQUOTA(inode)) {
+ inode_incr_space(inode, number, reserve);
+ goto out_unlock;
+ }
+
for (cnt = 0; cnt < MAXQUOTAS; cnt++)
warntype[cnt] = QUOTA_NL_NOWARN;
@@ -1415,7 +1491,8 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number,
if (check_bdq(inode->i_dquot[cnt], number, warn, warntype+cnt)
== NO_QUOTA) {
ret = NO_QUOTA;
- goto out_unlock;
+ spin_unlock(&dq_data_lock);
+ goto out_flush_warn;
}
}
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
@@ -1426,64 +1503,32 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number,
else
dquot_incr_space(inode->i_dquot[cnt], number);
}
- if (!reserve)
- inode_add_bytes(inode, number);
-out_unlock:
+ inode_incr_space(inode, number, reserve);
spin_unlock(&dq_data_lock);
- flush_warnings(inode->i_dquot, warntype);
- return ret;
-}
-
-int dquot_alloc_space(struct inode *inode, qsize_t number, int warn)
-{
- int cnt, ret = QUOTA_OK;
-
- /*
- * First test before acquiring mutex - solves deadlocks when we
- * re-enter the quota code and are already holding the mutex
- */
- if (IS_NOQUOTA(inode)) {
- inode_add_bytes(inode, number);
- goto out;
- }
-
- down_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
- if (IS_NOQUOTA(inode)) {
- inode_add_bytes(inode, number);
- goto out_unlock;
- }
-
- ret = __dquot_alloc_space(inode, number, warn, 0);
- if (ret == NO_QUOTA)
- goto out_unlock;
+ if (reserve)
+ goto out_flush_warn;
/* Dirtify all the dquots - this can block when journalling */
for (cnt = 0; cnt < MAXQUOTAS; cnt++)
if (inode->i_dquot[cnt])
mark_dquot_dirty(inode->i_dquot[cnt]);
+out_flush_warn:
+ flush_warnings(inode->i_dquot, warntype);
out_unlock:
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
out:
return ret;
}
+
+int dquot_alloc_space(struct inode *inode, qsize_t number, int warn)
+{
+ return __dquot_alloc_space(inode, number, warn, 0);
+}
EXPORT_SYMBOL(dquot_alloc_space);
int dquot_reserve_space(struct inode *inode, qsize_t number, int warn)
{
- int ret = QUOTA_OK;
-
- if (IS_NOQUOTA(inode))
- goto out;
-
- down_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
- if (IS_NOQUOTA(inode))
- goto out_unlock;
-
- ret = __dquot_alloc_space(inode, number, warn, 1);
-out_unlock:
- up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
-out:
- return ret;
+ return __dquot_alloc_space(inode, number, warn, 1);
}
EXPORT_SYMBOL(dquot_reserve_space);
@@ -1540,14 +1585,14 @@ int dquot_claim_space(struct inode *inode, qsize_t number)
int ret = QUOTA_OK;
if (IS_NOQUOTA(inode)) {
- inode_add_bytes(inode, number);
+ inode_claim_rsv_space(inode, number);
goto out;
}
down_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
if (IS_NOQUOTA(inode)) {
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
- inode_add_bytes(inode, number);
+ inode_claim_rsv_space(inode, number);
goto out;
}
@@ -1559,7 +1604,7 @@ int dquot_claim_space(struct inode *inode, qsize_t number)
number);
}
/* Update inode bytes */
- inode_add_bytes(inode, number);
+ inode_claim_rsv_space(inode, number);
spin_unlock(&dq_data_lock);
/* Dirtify all the dquots - this can block when journalling */
for (cnt = 0; cnt < MAXQUOTAS; cnt++)
@@ -1572,38 +1617,9 @@ out:
EXPORT_SYMBOL(dquot_claim_space);
/*
- * Release reserved quota space
- */
-void dquot_release_reserved_space(struct inode *inode, qsize_t number)
-{
- int cnt;
-
- if (IS_NOQUOTA(inode))
- goto out;
-
- down_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
- if (IS_NOQUOTA(inode))
- goto out_unlock;
-
- spin_lock(&dq_data_lock);
- /* Release reserved dquots */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
- if (inode->i_dquot[cnt])
- dquot_free_reserved_space(inode->i_dquot[cnt], number);
- }
- spin_unlock(&dq_data_lock);
-
-out_unlock:
- up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
-out:
- return;
-}
-EXPORT_SYMBOL(dquot_release_reserved_space);
-
-/*
* This operation can block, but only after everything is updated
*/
-int dquot_free_space(struct inode *inode, qsize_t number)
+int __dquot_free_space(struct inode *inode, qsize_t number, int reserve)
{
unsigned int cnt;
char warntype[MAXQUOTAS];
@@ -1612,7 +1628,7 @@ int dquot_free_space(struct inode *inode, qsize_t number)
* re-enter the quota code and are already holding the mutex */
if (IS_NOQUOTA(inode)) {
out_sub:
- inode_sub_bytes(inode, number);
+ inode_decr_space(inode, number, reserve);
return QUOTA_OK;
}
@@ -1627,21 +1643,43 @@ out_sub:
if (!inode->i_dquot[cnt])
continue;
warntype[cnt] = info_bdq_free(inode->i_dquot[cnt], number);
- dquot_decr_space(inode->i_dquot[cnt], number);
+ if (reserve)
+ dquot_free_reserved_space(inode->i_dquot[cnt], number);
+ else
+ dquot_decr_space(inode->i_dquot[cnt], number);
}
- inode_sub_bytes(inode, number);
+ inode_decr_space(inode, number, reserve);
spin_unlock(&dq_data_lock);
+
+ if (reserve)
+ goto out_unlock;
/* Dirtify all the dquots - this can block when journalling */
for (cnt = 0; cnt < MAXQUOTAS; cnt++)
if (inode->i_dquot[cnt])
mark_dquot_dirty(inode->i_dquot[cnt]);
+out_unlock:
flush_warnings(inode->i_dquot, warntype);
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
return QUOTA_OK;
}
+
+int dquot_free_space(struct inode *inode, qsize_t number)
+{
+ return __dquot_free_space(inode, number, 0);
+}
EXPORT_SYMBOL(dquot_free_space);
/*
+ * Release reserved quota space
+ */
+void dquot_release_reserved_space(struct inode *inode, qsize_t number)
+{
+ return (void )__dquot_free_space(inode, number, 1);
+
+}
+EXPORT_SYMBOL(dquot_release_reserved_space);
+
+/*
* This operation can block, but only after everything is updated
*/
int dquot_free_inode(const struct inode *inode, qsize_t number)
@@ -1679,19 +1717,6 @@ int dquot_free_inode(const struct inode *inode, qsize_t number)
EXPORT_SYMBOL(dquot_free_inode);
/*
- * call back function, get reserved quota space from underlying fs
- */
-qsize_t dquot_get_reserved_space(struct inode *inode)
-{
- qsize_t reserved_space = 0;
-
- if (sb_any_quota_active(inode->i_sb) &&
- inode->i_sb->dq_op->get_reserved_space)
- reserved_space = inode->i_sb->dq_op->get_reserved_space(inode);
- return reserved_space;
-}
-
-/*
* Transfer the number of inode and blocks from one diskquota to an other.
*
* This operation can block, but only after everything is updated
@@ -1734,7 +1759,7 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
}
spin_lock(&dq_data_lock);
cur_space = inode_get_bytes(inode);
- rsv_space = dquot_get_reserved_space(inode);
+ rsv_space = inode_get_rsv_space(inode);
space = cur_space + rsv_space;
/* Build the transfer_from list and check the limits */
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 78c4889..8fd8efc 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -313,8 +313,9 @@ struct dquot_operations {
int (*claim_space) (struct inode *, qsize_t);
/* release rsved quota for delayed alloc */
void (*release_rsv) (struct inode *, qsize_t);
- /* get reserved quota for delayed alloc */
- qsize_t (*get_reserved_space) (struct inode *);
+ /* get reserved quota for delayed alloc, value returned is managed by
+ * quota code only */
+ qsize_t *(*get_reserved_space) (struct inode *);
};
/* Operations handling requests from userspace */
--
1.6.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5]
2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov
@ 2009-12-14 12:21 ` Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov
` (2 more replies)
2009-12-14 17:24 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Jan Kara
1 sibling, 3 replies; 14+ messages in thread
From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov, linux-ext4
This patch fix write vs chown race condition.
Changes from V4
- Coding style cleanup according to Jan's comment.
Changes from V2:
- add missed i_reserved_quota iniatilization.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/ext4/ext4.h | 6 +++++-
fs/ext4/inode.c | 16 +++++++---------
fs/ext4/super.c | 3 +++
3 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 26d3cf8..708496f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -705,6 +705,10 @@ struct ext4_inode_info {
struct list_head i_aio_dio_complete_list;
/* current io_end structure for async DIO write*/
ext4_io_end_t *cur_aio_dio;
+#ifdef CONFIG_QUOTA
+ /* quota space reservation, managed internally by quota code */
+ qsize_t i_reserved_quota;
+#endif
};
/*
@@ -1427,7 +1431,7 @@ extern int ext4_chunk_trans_blocks(struct inode *, int nrblocks);
extern int ext4_block_truncate_page(handle_t *handle,
struct address_space *mapping, loff_t from);
extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
-extern qsize_t ext4_get_reserved_space(struct inode *inode);
+extern qsize_t *ext4_get_reserved_space(struct inode *inode);
extern int flush_aio_dio_completed_IO(struct inode *inode);
/* ioctl.c */
extern long ext4_ioctl(struct file *, unsigned int, unsigned long);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d99173a..8254fe6 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1048,17 +1048,12 @@ out:
return err;
}
-qsize_t ext4_get_reserved_space(struct inode *inode)
+#ifdef CONFIG_QUOTA
+qsize_t *ext4_get_reserved_space(struct inode *inode)
{
- unsigned long long total;
-
- spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
- total = EXT4_I(inode)->i_reserved_data_blocks +
- EXT4_I(inode)->i_reserved_meta_blocks;
- spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
-
- return (total << inode->i_blkbits);
+ return &EXT4_I(inode)->i_reserved_quota;
}
+#endif
/*
* Calculate the number of metadata blocks need to reserve
* to allocate @blocks for non extent file based file
@@ -5048,6 +5043,9 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
((__u64)le16_to_cpu(raw_inode->i_file_acl_high)) << 32;
inode->i_size = ext4_isize(raw_inode);
ei->i_disksize = inode->i_size;
+#ifdef CONFIG_QUOTA
+ ei->i_reserved_quota = 0;
+#endif
inode->i_generation = le32_to_cpu(raw_inode->i_generation);
ei->i_block_group = iloc.block_group;
ei->i_last_alloc_group = ~0;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4c87d97..4697272 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -713,6 +713,9 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
spin_lock_init(&(ei->i_block_reservation_lock));
INIT_LIST_HEAD(&ei->i_aio_dio_complete_list);
ei->cur_aio_dio = NULL;
+#ifdef CONFIG_QUOTA
+ ei->i_reserved_quota = 0;
+#endif
return &ei->vfs_inode;
}
--
1.6.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/5] quota: Move duplicated code to separate functions
2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov
@ 2009-12-14 12:21 ` Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov
2009-12-14 18:00 ` [PATCH 4/5] quota: Move duplicated code to separate functions Jan Kara
2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso
2009-12-14 17:25 ` Jan Kara
2 siblings, 2 replies; 14+ messages in thread
From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov
- for(..) { mark_dquot_dirty(); } -> mark_all_dquot_dirty()
- for(..) { dput_all(); } -> dqput_all()
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/quota/dquot.c | 75 ++++++++++++++++++++++++++----------------------------
1 files changed, 36 insertions(+), 39 deletions(-)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 942ea0b..d6d535c 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -327,6 +327,28 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
}
EXPORT_SYMBOL(dquot_mark_dquot_dirty);
+/* Dirtify all the dquots - this can block when journalling */
+static inline int mark_all_dquot_dirty(struct dquot **dquot)
+{
+ int ret, err, cnt;
+ ret = err = 0;
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+ if (dquot[cnt])
+ /* Even in case of error we have to continue */
+ ret = mark_dquot_dirty(dquot[cnt]);
+ if (!err)
+ err = ret;
+ }
+ return err;
+}
+
+static inline void dqput_all(struct dquot **dquot)
+{
+ unsigned int cnt;
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+ dqput(dquot[cnt]);
+}
+
/* This function needs dq_list_lock */
static inline int clear_dquot_dirty(struct dquot *dquot)
{
@@ -1337,8 +1359,7 @@ int dquot_initialize(struct inode *inode, int type)
out_err:
up_write(&sb_dqopt(sb)->dqptr_sem);
/* Drop unused references */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- dqput(got[cnt]);
+ dqput_all(got);
return ret;
}
EXPORT_SYMBOL(dquot_initialize);
@@ -1357,9 +1378,7 @@ int dquot_drop(struct inode *inode)
inode->i_dquot[cnt] = NULL;
}
up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
-
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- dqput(put[cnt]);
+ dqput_all(put);
return 0;
}
EXPORT_SYMBOL(dquot_drop);
@@ -1508,10 +1527,7 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number,
if (reserve)
goto out_flush_warn;
- /* Dirtify all the dquots - this can block when journalling */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- if (inode->i_dquot[cnt])
- mark_dquot_dirty(inode->i_dquot[cnt]);
+ mark_all_dquot_dirty(inode->i_dquot);
out_flush_warn:
flush_warnings(inode->i_dquot, warntype);
out_unlock:
@@ -1569,10 +1585,7 @@ int dquot_alloc_inode(const struct inode *inode, qsize_t number)
warn_put_all:
spin_unlock(&dq_data_lock);
if (ret == QUOTA_OK)
- /* Dirtify all the dquots - this can block when journalling */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- if (inode->i_dquot[cnt])
- mark_dquot_dirty(inode->i_dquot[cnt]);
+ mark_all_dquot_dirty((struct dquot **)inode->i_dquot);
flush_warnings(inode->i_dquot, warntype);
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
return ret;
@@ -1606,10 +1619,7 @@ int dquot_claim_space(struct inode *inode, qsize_t number)
/* Update inode bytes */
inode_claim_rsv_space(inode, number);
spin_unlock(&dq_data_lock);
- /* Dirtify all the dquots - this can block when journalling */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- if (inode->i_dquot[cnt])
- mark_dquot_dirty(inode->i_dquot[cnt]);
+ mark_all_dquot_dirty(inode->i_dquot);
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
out:
return ret;
@@ -1653,10 +1663,7 @@ out_sub:
if (reserve)
goto out_unlock;
- /* Dirtify all the dquots - this can block when journalling */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- if (inode->i_dquot[cnt])
- mark_dquot_dirty(inode->i_dquot[cnt]);
+ mark_all_dquot_dirty(inode->i_dquot);
out_unlock:
flush_warnings(inode->i_dquot, warntype);
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
@@ -1706,10 +1713,7 @@ int dquot_free_inode(const struct inode *inode, qsize_t number)
dquot_decr_inodes(inode->i_dquot[cnt], number);
}
spin_unlock(&dq_data_lock);
- /* Dirtify all the dquots - this can block when journalling */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- if (inode->i_dquot[cnt])
- mark_dquot_dirty(inode->i_dquot[cnt]);
+ mark_all_dquot_dirty((struct dquot **)inode->i_dquot);
flush_warnings(inode->i_dquot, warntype);
up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
return QUOTA_OK;
@@ -1803,25 +1807,18 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
spin_unlock(&dq_data_lock);
up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
- /* Dirtify all the dquots - this can block when journalling */
- for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
- if (transfer_from[cnt])
- mark_dquot_dirty(transfer_from[cnt]);
- if (transfer_to[cnt]) {
- mark_dquot_dirty(transfer_to[cnt]);
- /* The reference we got is transferred to the inode */
- transfer_to[cnt] = NULL;
- }
- }
+ mark_all_dquot_dirty(transfer_from);
+ mark_all_dquot_dirty(transfer_to);
+ /* The reference we got is transferred to the inode */
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+ transfer_to[cnt] = NULL;
warn_put_all:
flush_warnings(transfer_to, warntype_to);
flush_warnings(transfer_from, warntype_from_inodes);
flush_warnings(transfer_from, warntype_from_space);
put_all:
- for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
- dqput(transfer_from[cnt]);
- dqput(transfer_to[cnt]);
- }
+ dqput_all(transfer_from);
+ dqput_all(transfer_to);
return ret;
over_quota:
spin_unlock(&dq_data_lock);
--
1.6.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/5] quota: handle IO errors in dquot_transfer()
2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov
@ 2009-12-14 12:21 ` Dmitry Monakhov
2009-12-14 18:41 ` Jan Kara
2009-12-14 18:00 ` [PATCH 4/5] quota: Move duplicated code to separate functions Jan Kara
1 sibling, 1 reply; 14+ messages in thread
From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov
transfer_to[cnt] = dqget() may returns NULL due to IO error.
But NULL value in transfer_to[cnt] means a dquot transfer
optimization. So after operation succeed inode will have new
i_uid or i_gid but accounted to old dquot. This behaviour
is differ from dquot_initialize(). Let's handle IO error from
dqget() equally in all functions.
In appliance to dquot_transfer() this means that we have to finish
operation regardless to IO errors from dqget().
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
fs/quota/dquot.c | 46 +++++++++++++++++++++++++++-------------------
1 files changed, 27 insertions(+), 19 deletions(-)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index d6d535c..fbbaa4e 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1733,27 +1733,29 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
struct dquot *transfer_from[MAXQUOTAS];
struct dquot *transfer_to[MAXQUOTAS];
int cnt, ret = QUOTA_OK;
- int chuid = iattr->ia_valid & ATTR_UID && inode->i_uid != iattr->ia_uid,
- chgid = iattr->ia_valid & ATTR_GID && inode->i_gid != iattr->ia_gid;
char warntype_to[MAXQUOTAS];
char warntype_from_inodes[MAXQUOTAS], warntype_from_space[MAXQUOTAS];
+ char valid[MAXQUOTAS];
/* First test before acquiring mutex - solves deadlocks when we
* re-enter the quota code and are already holding the mutex */
if (IS_NOQUOTA(inode))
return QUOTA_OK;
/* Initialize the arrays */
+ memset(transfer_to, 0, sizeof(transfer_to));
+ memset(transfer_from, 0, sizeof(transfer_from));
+ memset(valid, 0, sizeof(valid));
+ valid[USRQUOTA] = iattr->ia_valid & ATTR_UID &&
+ inode->i_uid != iattr->ia_uid,
+ valid[GRPQUOTA] = iattr->ia_valid & ATTR_GID &&
+ inode->i_gid != iattr->ia_gid;
+
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
- transfer_from[cnt] = NULL;
- transfer_to[cnt] = NULL;
warntype_to[cnt] = QUOTA_NL_NOWARN;
+ if (!valid[cnt])
+ continue;
+ transfer_to[cnt] = dqget(inode->i_sb, iattr->ia_uid, cnt);
}
- if (chuid)
- transfer_to[USRQUOTA] = dqget(inode->i_sb, iattr->ia_uid,
- USRQUOTA);
- if (chgid)
- transfer_to[GRPQUOTA] = dqget(inode->i_sb, iattr->ia_gid,
- GRPQUOTA);
down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
/* Now recheck reliably when holding dqptr_sem */
@@ -1767,9 +1769,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
space = cur_space + rsv_space;
/* Build the transfer_from list and check the limits */
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
- if (!transfer_to[cnt])
+ if (!valid[cnt])
continue;
transfer_from[cnt] = inode->i_dquot[cnt];
+ if (!transfer_to[cnt])
+ continue;
if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) ==
NO_QUOTA || check_bdq(transfer_to[cnt], space, 0,
warntype_to + cnt) == NO_QUOTA)
@@ -1783,10 +1787,15 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
/*
* Skip changes for same uid or gid or for turned off quota-type.
*/
- if (!transfer_to[cnt])
+ if (!valid[cnt])
continue;
+ /*
+ * Due to IO error we might not have transfer_to[] or
+ * transfer_from[] structure. Nor than less the operation must
+ * being done regardless quota io errors.
+ */
+ inode->i_dquot[cnt] = transfer_to[cnt];
- /* Due to IO error we might not have transfer_from[] structure */
if (transfer_from[cnt]) {
warntype_from_inodes[cnt] =
info_idq_free(transfer_from[cnt], 1);
@@ -1797,12 +1806,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
dquot_free_reserved_space(transfer_from[cnt],
rsv_space);
}
-
- dquot_incr_inodes(transfer_to[cnt], 1);
- dquot_incr_space(transfer_to[cnt], cur_space);
- dquot_resv_space(transfer_to[cnt], rsv_space);
-
- inode->i_dquot[cnt] = transfer_to[cnt];
+ if (transfer_to[cnt]) {
+ dquot_incr_inodes(transfer_to[cnt], 1);
+ dquot_incr_space(transfer_to[cnt], cur_space);
+ dquot_resv_space(transfer_to[cnt], rsv_space);
+ }
}
spin_unlock(&dq_data_lock);
up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
--
1.6.0.4
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5]
2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov
@ 2009-12-14 14:35 ` tytso
2009-12-14 17:26 ` Jan Kara
2009-12-14 17:25 ` Jan Kara
2 siblings, 1 reply; 14+ messages in thread
From: tytso @ 2009-12-14 14:35 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel, linux-ext4
On Mon, Dec 14, 2009 at 03:21:14PM +0300, Dmitry Monakhov wrote:
> This patch fix write vs chown race condition.
>
> Changes from V4
> - Coding style cleanup according to Jan's comment.
> Changes from V2:
> - add missed i_reserved_quota iniatilization.
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Jan, I assume you'll push this along with the other quota-related
changes out from your tree? We have some ext4-related quota bugs that
I think may be related to this fix (and a bunch of distros are
planning on using 2.6.32), so if you could review this and get pushed
this patch series pushed out for 2.6.32 it would be appreciated,
thanks.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/5] Add unlocked version of inode_add_bytes() function
2009-12-14 12:21 [PATCH 1/5] Add unlocked version of inode_add_bytes() function Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov
@ 2009-12-14 17:21 ` Jan Kara
1 sibling, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-12-14 17:21 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel
On Mon 14-12-09 15:21:12, Dmitry Monakhov wrote:
> Quota code requires unlocked version of this function. Off course
> we can just copy-paste the code, but copy-pasting is always an evil.
Looks good. Merged into my tree.
Honza
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
> fs/stat.c | 10 ++++++++--
> include/linux/fs.h | 1 +
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/stat.c b/fs/stat.c
> index 075694e..c4ecd52 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -401,9 +401,9 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, char __user *, filename,
> }
> #endif /* __ARCH_WANT_STAT64 */
>
> -void inode_add_bytes(struct inode *inode, loff_t bytes)
> +/* Caller is here responsible for sufficient locking (ie. inode->i_lock) */
> +void __inode_add_bytes(struct inode *inode, loff_t bytes)
> {
> - spin_lock(&inode->i_lock);
> inode->i_blocks += bytes >> 9;
> bytes &= 511;
> inode->i_bytes += bytes;
> @@ -411,6 +411,12 @@ void inode_add_bytes(struct inode *inode, loff_t bytes)
> inode->i_blocks++;
> inode->i_bytes -= 512;
> }
> +}
> +
> +void inode_add_bytes(struct inode *inode, loff_t bytes)
> +{
> + spin_lock(&inode->i_lock);
> + __inode_add_bytes(inode, bytes);
> spin_unlock(&inode->i_lock);
> }
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2620a8c..98ea200 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2314,6 +2314,7 @@ extern const struct inode_operations page_symlink_inode_operations;
> extern int generic_readlink(struct dentry *, char __user *, int);
> extern void generic_fillattr(struct inode *, struct kstat *);
> extern int vfs_getattr(struct vfsmount *, struct dentry *, struct kstat *);
> +void __inode_add_bytes(struct inode *inode, loff_t bytes);
> void inode_add_bytes(struct inode *inode, loff_t bytes);
> void inode_sub_bytes(struct inode *inode, loff_t bytes);
> loff_t inode_get_bytes(struct inode *inode);
> --
> 1.6.0.4
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5]
2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov
@ 2009-12-14 17:24 ` Jan Kara
1 sibling, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-12-14 17:24 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel, linux-ext4
On Mon 14-12-09 15:21:13, Dmitry Monakhov wrote:
> Currently inode_reservation is managed by fs itself and this
> reservation is transfered on dquot_transfer(). This means what
> inode_reservation must always be in sync with
> dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result
> in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be
> triggered)
> This is not easy because of complex locking order issues
> for example http://bugzilla.kernel.org/show_bug.cgi?id=14739
>
> The patch introduce quota reservation field for each fs-inode
> (fs specific inode is used in order to prevent bloating generic
> vfs inode). This reservation is managed by quota code internally
> similar to i_blocks/i_bytes and may not be always in sync with
> internal fs reservation.
>
> Also perform some code rearrangement:
> - Unify dquot_reserve_space() and dquot_reserve_space()
> - Unify dquot_release_reserved_space() and dquot_free_space()
> - Also this patch add missing warning update to release_rsv()
> dquot_release_reserved_space() must call flush_warnings() as
> dquot_free_space() does.
>
> Changes from V4
> - fixes and cleanups according to Jan's comments.
> Changes from V3:
> - fix deadlock in dquota_alloc_space in journalled mode with nodelalloc
> Changes from V1:
> - move qutoa_reservation field from vfs_inode to fs_inode
> - account reservation in bytes instead of blocks.
Looks good. I've just slightly updated changelog and below
typecast (doing just __dquot_free_space(inode, number, 1)) and merged the
patch to my tree.
> +void dquot_release_reserved_space(struct inode *inode, qsize_t number)
> +{
> + return (void )__dquot_free_space(inode, number, 1);
> +
> +}
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5]
2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov
2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso
@ 2009-12-14 17:25 ` Jan Kara
2 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-12-14 17:25 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel, linux-ext4
On Mon 14-12-09 15:21:14, Dmitry Monakhov wrote:
> This patch fix write vs chown race condition.
>
> Changes from V4
> - Coding style cleanup according to Jan's comment.
> Changes from V2:
> - add missed i_reserved_quota iniatilization.
Looks good, merged into my tree.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5]
2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso
@ 2009-12-14 17:26 ` Jan Kara
0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-12-14 17:26 UTC (permalink / raw)
To: tytso; +Cc: Dmitry Monakhov, Jan Kara, linux-fsdevel, linux-ext4
On Mon 14-12-09 09:35:51, tytso@mit.edu wrote:
> On Mon, Dec 14, 2009 at 03:21:14PM +0300, Dmitry Monakhov wrote:
> > This patch fix write vs chown race condition.
> >
> > Changes from V4
> > - Coding style cleanup according to Jan's comment.
> > Changes from V2:
> > - add missed i_reserved_quota iniatilization.
> >
> > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
>
> Acked-by: "Theodore Ts'o" <tytso@mit.edu>
>
> Jan, I assume you'll push this along with the other quota-related
> changes out from your tree? We have some ext4-related quota bugs that
> I think may be related to this fix (and a bunch of distros are
> planning on using 2.6.32), so if you could review this and get pushed
> this patch series pushed out for 2.6.32 it would be appreciated,
> thanks.
Yes, I'll push it via my tree and take care of pushing into 2.6.32 stable
tree as well.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/5] quota: Move duplicated code to separate functions
2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov
@ 2009-12-14 18:00 ` Jan Kara
1 sibling, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-12-14 18:00 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel
On Mon 14-12-09 15:21:15, Dmitry Monakhov wrote:
> - for(..) { mark_dquot_dirty(); } -> mark_all_dquot_dirty()
> - for(..) { dput_all(); } -> dqput_all()
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> ---
> fs/quota/dquot.c | 75 ++++++++++++++++++++++++++----------------------------
> 1 files changed, 36 insertions(+), 39 deletions(-)
>
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 942ea0b..d6d535c 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -327,6 +327,28 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
> }
> EXPORT_SYMBOL(dquot_mark_dquot_dirty);
>
> +/* Dirtify all the dquots - this can block when journalling */
> +static inline int mark_all_dquot_dirty(struct dquot **dquot)
> +{
> + int ret, err, cnt;
Added empty line here.
> + ret = err = 0;
> + for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> + if (dquot[cnt])
> + /* Even in case of error we have to continue */
> + ret = mark_dquot_dirty(dquot[cnt]);
> + if (!err)
> + err = ret;
> + }
> + return err;
> +}
> +
> +static inline void dqput_all(struct dquot **dquot)
> +{
> + unsigned int cnt;
Added empty line here.
> + for (cnt = 0; cnt < MAXQUOTAS; cnt++)
> + dqput(dquot[cnt]);
> +}
...
> @@ -1569,10 +1585,7 @@ int dquot_alloc_inode(const struct inode *inode, qsize_t number)
> warn_put_all:
> spin_unlock(&dq_data_lock);
> if (ret == QUOTA_OK)
> - /* Dirtify all the dquots - this can block when journalling */
> - for (cnt = 0; cnt < MAXQUOTAS; cnt++)
> - if (inode->i_dquot[cnt])
> - mark_dquot_dirty(inode->i_dquot[cnt]);
> + mark_all_dquot_dirty((struct dquot **)inode->i_dquot);
Removed the typecast here.
> @@ -1706,10 +1713,7 @@ int dquot_free_inode(const struct inode *inode, qsize_t number)
> dquot_decr_inodes(inode->i_dquot[cnt], number);
> }
> spin_unlock(&dq_data_lock);
> - /* Dirtify all the dquots - this can block when journalling */
> - for (cnt = 0; cnt < MAXQUOTAS; cnt++)
> - if (inode->i_dquot[cnt])
> - mark_dquot_dirty(inode->i_dquot[cnt]);
> + mark_all_dquot_dirty((struct dquot **)inode->i_dquot);
Removed the typecast here.
...
Merged the patch into my tree. Thanks.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 5/5] quota: handle IO errors in dquot_transfer()
2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov
@ 2009-12-14 18:41 ` Jan Kara
2009-12-14 22:43 ` Dmitry Monakhov
0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-12-14 18:41 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel
On Mon 14-12-09 15:21:16, Dmitry Monakhov wrote:
> transfer_to[cnt] = dqget() may returns NULL due to IO error.
> But NULL value in transfer_to[cnt] means a dquot transfer
> optimization. So after operation succeed inode will have new
> i_uid or i_gid but accounted to old dquot. This behaviour
> is differ from dquot_initialize(). Let's handle IO error from
> dqget() equally in all functions.
>
> In appliance to dquot_transfer() this means that we have to finish
> operation regardless to IO errors from dqget().
In principle, the patch is fine (see just about a bug below). But even
better would be if you converted dquot_transfer() to actually return real
return codes (0, -EDQUOT, -EIO...) and make it return EIO in case of IO
error. Then a filesystem can properly fail chown if quota transfer cannot
be performed. This is a larger project since filesystems using
dquot_transfer need to be changed and also other dquot_... functions need
to be converted to this calling convention for the sake of consistency. But
it would be a good cleanup.
...
> for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> - transfer_from[cnt] = NULL;
> - transfer_to[cnt] = NULL;
> warntype_to[cnt] = QUOTA_NL_NOWARN;
> + if (!valid[cnt])
> + continue;
> + transfer_to[cnt] = dqget(inode->i_sb, iattr->ia_uid, cnt);
This is wrong! You have to take ia_gid in case of GRPQUOTA.
> @@ -1767,9 +1769,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
> space = cur_space + rsv_space;
> /* Build the transfer_from list and check the limits */
> for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
> - if (!transfer_to[cnt])
> + if (!valid[cnt])
> continue;
> transfer_from[cnt] = inode->i_dquot[cnt];
> + if (!transfer_to[cnt])
> + continue;
> if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) ==
> NO_QUOTA || check_bdq(transfer_to[cnt], space, 0,
> warntype_to + cnt) == NO_QUOTA)
> @@ -1783,10 +1787,15 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
> /*
> * Skip changes for same uid or gid or for turned off quota-type.
> */
> - if (!transfer_to[cnt])
> + if (!valid[cnt])
> continue;
> + /*
> + * Due to IO error we might not have transfer_to[] or
> + * transfer_from[] structure. Nor than less the operation must
^^ I suppose this should be
"Nonetheless"
> + * being done regardless quota io errors.
> + */
> + inode->i_dquot[cnt] = transfer_to[cnt];
>
> - /* Due to IO error we might not have transfer_from[] structure */
> if (transfer_from[cnt]) {
> warntype_from_inodes[cnt] =
> info_idq_free(transfer_from[cnt], 1);
> @@ -1797,12 +1806,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
> dquot_free_reserved_space(transfer_from[cnt],
> rsv_space);
> }
> -
> - dquot_incr_inodes(transfer_to[cnt], 1);
> - dquot_incr_space(transfer_to[cnt], cur_space);
> - dquot_resv_space(transfer_to[cnt], rsv_space);
> -
> - inode->i_dquot[cnt] = transfer_to[cnt];
> + if (transfer_to[cnt]) {
> + dquot_incr_inodes(transfer_to[cnt], 1);
> + dquot_incr_space(transfer_to[cnt], cur_space);
> + dquot_resv_space(transfer_to[cnt], rsv_space);
> + }
> }
> spin_unlock(&dq_data_lock);
> up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
> --
> 1.6.0.4
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 5/5] quota: handle IO errors in dquot_transfer()
2009-12-14 18:41 ` Jan Kara
@ 2009-12-14 22:43 ` Dmitry Monakhov
2009-12-15 11:49 ` Jan Kara
0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Monakhov @ 2009-12-14 22:43 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel
Jan Kara <jack@suse.cz> writes:
> On Mon 14-12-09 15:21:16, Dmitry Monakhov wrote:
>> transfer_to[cnt] = dqget() may returns NULL due to IO error.
>> But NULL value in transfer_to[cnt] means a dquot transfer
>> optimization. So after operation succeed inode will have new
>> i_uid or i_gid but accounted to old dquot. This behaviour
>> is differ from dquot_initialize(). Let's handle IO error from
>> dqget() equally in all functions.
>>
>> In appliance to dquot_transfer() this means that we have to finish
>> operation regardless to IO errors from dqget().
> In principle, the patch is fine (see just about a bug below). But even
> better would be if you converted dquot_transfer() to actually return real
> return codes (0, -EDQUOT, -EIO...) and make it return EIO in case of IO
Actually we have following set of errors (0, -EDQUOT, -EIO, -ENOMEM, -ENOSPC)
And ENOSPC is more realistic than EIO or ENOMEM. But from other point of view
quota is some sort of fs meta-data, so we can not let it just *silently* fail
because of some error as it done at the moment.
> error. Then a filesystem can properly fail chown if quota transfer cannot
> be performed. This is a larger project since filesystems using
> dquot_transfer need to be changed and also other dquot_... functions need
> to be converted to this calling convention for the sake of consistency. But
> it would be a good cleanup.
I'll handle it.
>
> ...
>> for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
>> - transfer_from[cnt] = NULL;
>> - transfer_to[cnt] = NULL;
>> warntype_to[cnt] = QUOTA_NL_NOWARN;
>> + if (!valid[cnt])
>> + continue;
>> + transfer_to[cnt] = dqget(inode->i_sb, iattr->ia_uid, cnt);
> This is wrong! You have to take ia_gid in case of GRPQUOTA.
>
>> @@ -1767,9 +1769,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
>> space = cur_space + rsv_space;
>> /* Build the transfer_from list and check the limits */
>> for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
>> - if (!transfer_to[cnt])
>> + if (!valid[cnt])
>> continue;
>> transfer_from[cnt] = inode->i_dquot[cnt];
>> + if (!transfer_to[cnt])
>> + continue;
>> if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) ==
>> NO_QUOTA || check_bdq(transfer_to[cnt], space, 0,
>> warntype_to + cnt) == NO_QUOTA)
>> @@ -1783,10 +1787,15 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
>> /*
>> * Skip changes for same uid or gid or for turned off quota-type.
>> */
>> - if (!transfer_to[cnt])
>> + if (!valid[cnt])
>> continue;
>> + /*
>> + * Due to IO error we might not have transfer_to[] or
>> + * transfer_from[] structure. Nor than less the operation must
> ^^ I suppose this should be
> "Nonetheless"
>> + * being done regardless quota io errors.
>> + */
>> + inode->i_dquot[cnt] = transfer_to[cnt];
>>
>> - /* Due to IO error we might not have transfer_from[] structure */
>> if (transfer_from[cnt]) {
>> warntype_from_inodes[cnt] =
>> info_idq_free(transfer_from[cnt], 1);
>> @@ -1797,12 +1806,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
>> dquot_free_reserved_space(transfer_from[cnt],
>> rsv_space);
>> }
>> -
>> - dquot_incr_inodes(transfer_to[cnt], 1);
>> - dquot_incr_space(transfer_to[cnt], cur_space);
>> - dquot_resv_space(transfer_to[cnt], rsv_space);
>> -
>> - inode->i_dquot[cnt] = transfer_to[cnt];
>> + if (transfer_to[cnt]) {
>> + dquot_incr_inodes(transfer_to[cnt], 1);
>> + dquot_incr_space(transfer_to[cnt], cur_space);
>> + dquot_resv_space(transfer_to[cnt], rsv_space);
>> + }
>> }
>> spin_unlock(&dq_data_lock);
>> up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
>> --
>> 1.6.0.4
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 5/5] quota: handle IO errors in dquot_transfer()
2009-12-14 22:43 ` Dmitry Monakhov
@ 2009-12-15 11:49 ` Jan Kara
0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-12-15 11:49 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel
On Tue 15-12-09 01:43:18, Dmitry Monakhov wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > On Mon 14-12-09 15:21:16, Dmitry Monakhov wrote:
> >> transfer_to[cnt] = dqget() may returns NULL due to IO error.
> >> But NULL value in transfer_to[cnt] means a dquot transfer
> >> optimization. So after operation succeed inode will have new
> >> i_uid or i_gid but accounted to old dquot. This behaviour
> >> is differ from dquot_initialize(). Let's handle IO error from
> >> dqget() equally in all functions.
> >>
> >> In appliance to dquot_transfer() this means that we have to finish
> >> operation regardless to IO errors from dqget().
> > In principle, the patch is fine (see just about a bug below). But even
> > better would be if you converted dquot_transfer() to actually return real
> > return codes (0, -EDQUOT, -EIO...) and make it return EIO in case of IO
> Actually we have following set of errors (0, -EDQUOT, -EIO, -ENOMEM, -ENOSPC)
> And ENOSPC is more realistic than EIO or ENOMEM. But from other point of view
> quota is some sort of fs meta-data, so we can not let it just *silently* fail
> because of some error as it done at the moment.
Exactly. That's why I'd like to get this cleaned up in the long run...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-12-15 11:49 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-14 12:21 [PATCH 1/5] Add unlocked version of inode_add_bytes() function Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov
2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov
2009-12-14 18:41 ` Jan Kara
2009-12-14 22:43 ` Dmitry Monakhov
2009-12-15 11:49 ` Jan Kara
2009-12-14 18:00 ` [PATCH 4/5] quota: Move duplicated code to separate functions Jan Kara
2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso
2009-12-14 17:26 ` Jan Kara
2009-12-14 17:25 ` Jan Kara
2009-12-14 17:24 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Jan Kara
2009-12-14 17:21 ` [PATCH 1/5] Add unlocked version of inode_add_bytes() function Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).