* [PATCH 1/5] Add unlocked version of inode_add_bytes() function @ 2009-12-14 12:21 Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov 2009-12-14 17:21 ` [PATCH 1/5] Add unlocked version of inode_add_bytes() function Jan Kara 0 siblings, 2 replies; 14+ messages in thread From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov Quota code requires unlocked version of this function. Off course we can just copy-paste the code, but copy-pasting is always an evil. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/stat.c | 10 ++++++++-- include/linux/fs.h | 1 + 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/stat.c b/fs/stat.c index 075694e..c4ecd52 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -401,9 +401,9 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, char __user *, filename, } #endif /* __ARCH_WANT_STAT64 */ -void inode_add_bytes(struct inode *inode, loff_t bytes) +/* Caller is here responsible for sufficient locking (ie. inode->i_lock) */ +void __inode_add_bytes(struct inode *inode, loff_t bytes) { - spin_lock(&inode->i_lock); inode->i_blocks += bytes >> 9; bytes &= 511; inode->i_bytes += bytes; @@ -411,6 +411,12 @@ void inode_add_bytes(struct inode *inode, loff_t bytes) inode->i_blocks++; inode->i_bytes -= 512; } +} + +void inode_add_bytes(struct inode *inode, loff_t bytes) +{ + spin_lock(&inode->i_lock); + __inode_add_bytes(inode, bytes); spin_unlock(&inode->i_lock); } diff --git a/include/linux/fs.h b/include/linux/fs.h index 2620a8c..98ea200 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2314,6 +2314,7 @@ extern const struct inode_operations page_symlink_inode_operations; extern int generic_readlink(struct dentry *, char __user *, int); extern void generic_fillattr(struct inode *, struct kstat *); extern int vfs_getattr(struct vfsmount *, struct dentry *, struct kstat *); +void __inode_add_bytes(struct inode *inode, loff_t bytes); void inode_add_bytes(struct inode *inode, loff_t bytes); void inode_sub_bytes(struct inode *inode, loff_t bytes); loff_t inode_get_bytes(struct inode *inode); -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] 2009-12-14 12:21 [PATCH 1/5] Add unlocked version of inode_add_bytes() function Dmitry Monakhov @ 2009-12-14 12:21 ` Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov 2009-12-14 17:24 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Jan Kara 2009-12-14 17:21 ` [PATCH 1/5] Add unlocked version of inode_add_bytes() function Jan Kara 1 sibling, 2 replies; 14+ messages in thread From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov, linux-ext4 Currently inode_reservation is managed by fs itself and this reservation is transfered on dquot_transfer(). This means what inode_reservation must always be in sync with dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be triggered) This is not easy because of complex locking order issues for example http://bugzilla.kernel.org/show_bug.cgi?id=14739 The patch introduce quota reservation field for each fs-inode (fs specific inode is used in order to prevent bloating generic vfs inode). This reservation is managed by quota code internally similar to i_blocks/i_bytes and may not be always in sync with internal fs reservation. Also perform some code rearrangement: - Unify dquot_reserve_space() and dquot_reserve_space() - Unify dquot_release_reserved_space() and dquot_free_space() - Also this patch add missing warning update to release_rsv() dquot_release_reserved_space() must call flush_warnings() as dquot_free_space() does. Changes from V4 - fixes and cleanups according to Jan's comments. Changes from V3: - fix deadlock in dquota_alloc_space in journalled mode with nodelalloc Changes from V1: - move qutoa_reservation field from vfs_inode to fs_inode - account reservation in bytes instead of blocks. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/quota/dquot.c | 213 +++++++++++++++++++++++++++---------------------- include/linux/quota.h | 5 +- 2 files changed, 122 insertions(+), 96 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index 39b49c4..942ea0b 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -1388,6 +1388,67 @@ void vfs_dq_drop(struct inode *inode) EXPORT_SYMBOL(vfs_dq_drop); /* + * inode_reserved_space is managed internally by quota, and protected by + * i_lock similar to i_blocks+i_bytes. + */ +static qsize_t *inode_reserved_space(struct inode * inode) +{ + /* Filesystem must explicitly define it's own method in order to use + * quota reservation interface */ + BUG_ON(!inode->i_sb->dq_op->get_reserved_space); + return inode->i_sb->dq_op->get_reserved_space(inode); +} + +static void inode_add_rsv_space(struct inode *inode, qsize_t number) +{ + spin_lock(&inode->i_lock); + *inode_reserved_space(inode) += number; + spin_unlock(&inode->i_lock); +} + + +static void inode_claim_rsv_space(struct inode *inode, qsize_t number) +{ + spin_lock(&inode->i_lock); + *inode_reserved_space(inode) -= number; + __inode_add_bytes(inode, number); + spin_unlock(&inode->i_lock); +} + +static void inode_sub_rsv_space(struct inode *inode, qsize_t number) +{ + spin_lock(&inode->i_lock); + *inode_reserved_space(inode) -= number; + spin_unlock(&inode->i_lock); +} + +static qsize_t inode_get_rsv_space(struct inode *inode) +{ + qsize_t ret; + spin_lock(&inode->i_lock); + ret = *inode_reserved_space(inode); + spin_unlock(&inode->i_lock); + return ret; +} + +static void inode_incr_space(struct inode *inode, qsize_t number, + int reserve) +{ + if (reserve) + inode_add_rsv_space(inode, number); + else + inode_add_bytes(inode, number); +} + +static void inode_decr_space(struct inode *inode, qsize_t number, int reserve) +{ + if (reserve) + inode_sub_rsv_space(inode, number); + else + inode_sub_bytes(inode, number); +} + +/* * Following four functions update i_blocks+i_bytes fields and * quota information (together with appropriate checks) * NOTE: We absolutely rely on the fact that caller dirties @@ -1405,6 +1466,21 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int cnt, ret = QUOTA_OK; char warntype[MAXQUOTAS]; + /* + * First test before acquiring mutex - solves deadlocks when we + * re-enter the quota code and are already holding the mutex + */ + if (IS_NOQUOTA(inode)) { + inode_incr_space(inode, number, reserve); + goto out; + } + + down_read(&sb_dqopt(inode->i_sb)->dqptr_sem); + if (IS_NOQUOTA(inode)) { + inode_incr_space(inode, number, reserve); + goto out_unlock; + } + for (cnt = 0; cnt < MAXQUOTAS; cnt++) warntype[cnt] = QUOTA_NL_NOWARN; @@ -1415,7 +1491,8 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, if (check_bdq(inode->i_dquot[cnt], number, warn, warntype+cnt) == NO_QUOTA) { ret = NO_QUOTA; - goto out_unlock; + spin_unlock(&dq_data_lock); + goto out_flush_warn; } } for (cnt = 0; cnt < MAXQUOTAS; cnt++) { @@ -1426,64 +1503,32 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, else dquot_incr_space(inode->i_dquot[cnt], number); } - if (!reserve) - inode_add_bytes(inode, number); -out_unlock: + inode_incr_space(inode, number, reserve); spin_unlock(&dq_data_lock); - flush_warnings(inode->i_dquot, warntype); - return ret; -} - -int dquot_alloc_space(struct inode *inode, qsize_t number, int warn) -{ - int cnt, ret = QUOTA_OK; - - /* - * First test before acquiring mutex - solves deadlocks when we - * re-enter the quota code and are already holding the mutex - */ - if (IS_NOQUOTA(inode)) { - inode_add_bytes(inode, number); - goto out; - } - - down_read(&sb_dqopt(inode->i_sb)->dqptr_sem); - if (IS_NOQUOTA(inode)) { - inode_add_bytes(inode, number); - goto out_unlock; - } - - ret = __dquot_alloc_space(inode, number, warn, 0); - if (ret == NO_QUOTA) - goto out_unlock; + if (reserve) + goto out_flush_warn; /* Dirtify all the dquots - this can block when journalling */ for (cnt = 0; cnt < MAXQUOTAS; cnt++) if (inode->i_dquot[cnt]) mark_dquot_dirty(inode->i_dquot[cnt]); +out_flush_warn: + flush_warnings(inode->i_dquot, warntype); out_unlock: up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); out: return ret; } + +int dquot_alloc_space(struct inode *inode, qsize_t number, int warn) +{ + return __dquot_alloc_space(inode, number, warn, 0); +} EXPORT_SYMBOL(dquot_alloc_space); int dquot_reserve_space(struct inode *inode, qsize_t number, int warn) { - int ret = QUOTA_OK; - - if (IS_NOQUOTA(inode)) - goto out; - - down_read(&sb_dqopt(inode->i_sb)->dqptr_sem); - if (IS_NOQUOTA(inode)) - goto out_unlock; - - ret = __dquot_alloc_space(inode, number, warn, 1); -out_unlock: - up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); -out: - return ret; + return __dquot_alloc_space(inode, number, warn, 1); } EXPORT_SYMBOL(dquot_reserve_space); @@ -1540,14 +1585,14 @@ int dquot_claim_space(struct inode *inode, qsize_t number) int ret = QUOTA_OK; if (IS_NOQUOTA(inode)) { - inode_add_bytes(inode, number); + inode_claim_rsv_space(inode, number); goto out; } down_read(&sb_dqopt(inode->i_sb)->dqptr_sem); if (IS_NOQUOTA(inode)) { up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); - inode_add_bytes(inode, number); + inode_claim_rsv_space(inode, number); goto out; } @@ -1559,7 +1604,7 @@ int dquot_claim_space(struct inode *inode, qsize_t number) number); } /* Update inode bytes */ - inode_add_bytes(inode, number); + inode_claim_rsv_space(inode, number); spin_unlock(&dq_data_lock); /* Dirtify all the dquots - this can block when journalling */ for (cnt = 0; cnt < MAXQUOTAS; cnt++) @@ -1572,38 +1617,9 @@ out: EXPORT_SYMBOL(dquot_claim_space); /* - * Release reserved quota space - */ -void dquot_release_reserved_space(struct inode *inode, qsize_t number) -{ - int cnt; - - if (IS_NOQUOTA(inode)) - goto out; - - down_read(&sb_dqopt(inode->i_sb)->dqptr_sem); - if (IS_NOQUOTA(inode)) - goto out_unlock; - - spin_lock(&dq_data_lock); - /* Release reserved dquots */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) { - if (inode->i_dquot[cnt]) - dquot_free_reserved_space(inode->i_dquot[cnt], number); - } - spin_unlock(&dq_data_lock); - -out_unlock: - up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); -out: - return; -} -EXPORT_SYMBOL(dquot_release_reserved_space); - -/* * This operation can block, but only after everything is updated */ -int dquot_free_space(struct inode *inode, qsize_t number) +int __dquot_free_space(struct inode *inode, qsize_t number, int reserve) { unsigned int cnt; char warntype[MAXQUOTAS]; @@ -1612,7 +1628,7 @@ int dquot_free_space(struct inode *inode, qsize_t number) * re-enter the quota code and are already holding the mutex */ if (IS_NOQUOTA(inode)) { out_sub: - inode_sub_bytes(inode, number); + inode_decr_space(inode, number, reserve); return QUOTA_OK; } @@ -1627,21 +1643,43 @@ out_sub: if (!inode->i_dquot[cnt]) continue; warntype[cnt] = info_bdq_free(inode->i_dquot[cnt], number); - dquot_decr_space(inode->i_dquot[cnt], number); + if (reserve) + dquot_free_reserved_space(inode->i_dquot[cnt], number); + else + dquot_decr_space(inode->i_dquot[cnt], number); } - inode_sub_bytes(inode, number); + inode_decr_space(inode, number, reserve); spin_unlock(&dq_data_lock); + + if (reserve) + goto out_unlock; /* Dirtify all the dquots - this can block when journalling */ for (cnt = 0; cnt < MAXQUOTAS; cnt++) if (inode->i_dquot[cnt]) mark_dquot_dirty(inode->i_dquot[cnt]); +out_unlock: flush_warnings(inode->i_dquot, warntype); up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); return QUOTA_OK; } + +int dquot_free_space(struct inode *inode, qsize_t number) +{ + return __dquot_free_space(inode, number, 0); +} EXPORT_SYMBOL(dquot_free_space); /* + * Release reserved quota space + */ +void dquot_release_reserved_space(struct inode *inode, qsize_t number) +{ + return (void )__dquot_free_space(inode, number, 1); + +} +EXPORT_SYMBOL(dquot_release_reserved_space); + +/* * This operation can block, but only after everything is updated */ int dquot_free_inode(const struct inode *inode, qsize_t number) @@ -1679,19 +1717,6 @@ int dquot_free_inode(const struct inode *inode, qsize_t number) EXPORT_SYMBOL(dquot_free_inode); /* - * call back function, get reserved quota space from underlying fs - */ -qsize_t dquot_get_reserved_space(struct inode *inode) -{ - qsize_t reserved_space = 0; - - if (sb_any_quota_active(inode->i_sb) && - inode->i_sb->dq_op->get_reserved_space) - reserved_space = inode->i_sb->dq_op->get_reserved_space(inode); - return reserved_space; -} - -/* * Transfer the number of inode and blocks from one diskquota to an other. * * This operation can block, but only after everything is updated @@ -1734,7 +1759,7 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) } spin_lock(&dq_data_lock); cur_space = inode_get_bytes(inode); - rsv_space = dquot_get_reserved_space(inode); + rsv_space = inode_get_rsv_space(inode); space = cur_space + rsv_space; /* Build the transfer_from list and check the limits */ for (cnt = 0; cnt < MAXQUOTAS; cnt++) { diff --git a/include/linux/quota.h b/include/linux/quota.h index 78c4889..8fd8efc 100644 --- a/include/linux/quota.h +++ b/include/linux/quota.h @@ -313,8 +313,9 @@ struct dquot_operations { int (*claim_space) (struct inode *, qsize_t); /* release rsved quota for delayed alloc */ void (*release_rsv) (struct inode *, qsize_t); - /* get reserved quota for delayed alloc */ - qsize_t (*get_reserved_space) (struct inode *); + /* get reserved quota for delayed alloc, value returned is managed by + * quota code only */ + qsize_t *(*get_reserved_space) (struct inode *); }; /* Operations handling requests from userspace */ -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] 2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov @ 2009-12-14 12:21 ` Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov ` (2 more replies) 2009-12-14 17:24 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Jan Kara 1 sibling, 3 replies; 14+ messages in thread From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov, linux-ext4 This patch fix write vs chown race condition. Changes from V4 - Coding style cleanup according to Jan's comment. Changes from V2: - add missed i_reserved_quota iniatilization. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/ext4/ext4.h | 6 +++++- fs/ext4/inode.c | 16 +++++++--------- fs/ext4/super.c | 3 +++ 3 files changed, 15 insertions(+), 10 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 26d3cf8..708496f 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -705,6 +705,10 @@ struct ext4_inode_info { struct list_head i_aio_dio_complete_list; /* current io_end structure for async DIO write*/ ext4_io_end_t *cur_aio_dio; +#ifdef CONFIG_QUOTA + /* quota space reservation, managed internally by quota code */ + qsize_t i_reserved_quota; +#endif }; /* @@ -1427,7 +1431,7 @@ extern int ext4_chunk_trans_blocks(struct inode *, int nrblocks); extern int ext4_block_truncate_page(handle_t *handle, struct address_space *mapping, loff_t from); extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf); -extern qsize_t ext4_get_reserved_space(struct inode *inode); +extern qsize_t *ext4_get_reserved_space(struct inode *inode); extern int flush_aio_dio_completed_IO(struct inode *inode); /* ioctl.c */ extern long ext4_ioctl(struct file *, unsigned int, unsigned long); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d99173a..8254fe6 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1048,17 +1048,12 @@ out: return err; } -qsize_t ext4_get_reserved_space(struct inode *inode) +#ifdef CONFIG_QUOTA +qsize_t *ext4_get_reserved_space(struct inode *inode) { - unsigned long long total; - - spin_lock(&EXT4_I(inode)->i_block_reservation_lock); - total = EXT4_I(inode)->i_reserved_data_blocks + - EXT4_I(inode)->i_reserved_meta_blocks; - spin_unlock(&EXT4_I(inode)->i_block_reservation_lock); - - return (total << inode->i_blkbits); + return &EXT4_I(inode)->i_reserved_quota; } +#endif /* * Calculate the number of metadata blocks need to reserve * to allocate @blocks for non extent file based file @@ -5048,6 +5043,9 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino) ((__u64)le16_to_cpu(raw_inode->i_file_acl_high)) << 32; inode->i_size = ext4_isize(raw_inode); ei->i_disksize = inode->i_size; +#ifdef CONFIG_QUOTA + ei->i_reserved_quota = 0; +#endif inode->i_generation = le32_to_cpu(raw_inode->i_generation); ei->i_block_group = iloc.block_group; ei->i_last_alloc_group = ~0; diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 4c87d97..4697272 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -713,6 +713,9 @@ static struct inode *ext4_alloc_inode(struct super_block *sb) spin_lock_init(&(ei->i_block_reservation_lock)); INIT_LIST_HEAD(&ei->i_aio_dio_complete_list); ei->cur_aio_dio = NULL; +#ifdef CONFIG_QUOTA + ei->i_reserved_quota = 0; +#endif return &ei->vfs_inode; } -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/5] quota: Move duplicated code to separate functions 2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov @ 2009-12-14 12:21 ` Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov 2009-12-14 18:00 ` [PATCH 4/5] quota: Move duplicated code to separate functions Jan Kara 2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso 2009-12-14 17:25 ` Jan Kara 2 siblings, 2 replies; 14+ messages in thread From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov - for(..) { mark_dquot_dirty(); } -> mark_all_dquot_dirty() - for(..) { dput_all(); } -> dqput_all() Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/quota/dquot.c | 75 ++++++++++++++++++++++++++---------------------------- 1 files changed, 36 insertions(+), 39 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index 942ea0b..d6d535c 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -327,6 +327,28 @@ int dquot_mark_dquot_dirty(struct dquot *dquot) } EXPORT_SYMBOL(dquot_mark_dquot_dirty); +/* Dirtify all the dquots - this can block when journalling */ +static inline int mark_all_dquot_dirty(struct dquot **dquot) +{ + int ret, err, cnt; + ret = err = 0; + for (cnt = 0; cnt < MAXQUOTAS; cnt++) { + if (dquot[cnt]) + /* Even in case of error we have to continue */ + ret = mark_dquot_dirty(dquot[cnt]); + if (!err) + err = ret; + } + return err; +} + +static inline void dqput_all(struct dquot **dquot) +{ + unsigned int cnt; + for (cnt = 0; cnt < MAXQUOTAS; cnt++) + dqput(dquot[cnt]); +} + /* This function needs dq_list_lock */ static inline int clear_dquot_dirty(struct dquot *dquot) { @@ -1337,8 +1359,7 @@ int dquot_initialize(struct inode *inode, int type) out_err: up_write(&sb_dqopt(sb)->dqptr_sem); /* Drop unused references */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - dqput(got[cnt]); + dqput_all(got); return ret; } EXPORT_SYMBOL(dquot_initialize); @@ -1357,9 +1378,7 @@ int dquot_drop(struct inode *inode) inode->i_dquot[cnt] = NULL; } up_write(&sb_dqopt(inode->i_sb)->dqptr_sem); - - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - dqput(put[cnt]); + dqput_all(put); return 0; } EXPORT_SYMBOL(dquot_drop); @@ -1508,10 +1527,7 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, if (reserve) goto out_flush_warn; - /* Dirtify all the dquots - this can block when journalling */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - if (inode->i_dquot[cnt]) - mark_dquot_dirty(inode->i_dquot[cnt]); + mark_all_dquot_dirty(inode->i_dquot); out_flush_warn: flush_warnings(inode->i_dquot, warntype); out_unlock: @@ -1569,10 +1585,7 @@ int dquot_alloc_inode(const struct inode *inode, qsize_t number) warn_put_all: spin_unlock(&dq_data_lock); if (ret == QUOTA_OK) - /* Dirtify all the dquots - this can block when journalling */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - if (inode->i_dquot[cnt]) - mark_dquot_dirty(inode->i_dquot[cnt]); + mark_all_dquot_dirty((struct dquot **)inode->i_dquot); flush_warnings(inode->i_dquot, warntype); up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); return ret; @@ -1606,10 +1619,7 @@ int dquot_claim_space(struct inode *inode, qsize_t number) /* Update inode bytes */ inode_claim_rsv_space(inode, number); spin_unlock(&dq_data_lock); - /* Dirtify all the dquots - this can block when journalling */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - if (inode->i_dquot[cnt]) - mark_dquot_dirty(inode->i_dquot[cnt]); + mark_all_dquot_dirty(inode->i_dquot); up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); out: return ret; @@ -1653,10 +1663,7 @@ out_sub: if (reserve) goto out_unlock; - /* Dirtify all the dquots - this can block when journalling */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - if (inode->i_dquot[cnt]) - mark_dquot_dirty(inode->i_dquot[cnt]); + mark_all_dquot_dirty(inode->i_dquot); out_unlock: flush_warnings(inode->i_dquot, warntype); up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); @@ -1706,10 +1713,7 @@ int dquot_free_inode(const struct inode *inode, qsize_t number) dquot_decr_inodes(inode->i_dquot[cnt], number); } spin_unlock(&dq_data_lock); - /* Dirtify all the dquots - this can block when journalling */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) - if (inode->i_dquot[cnt]) - mark_dquot_dirty(inode->i_dquot[cnt]); + mark_all_dquot_dirty((struct dquot **)inode->i_dquot); flush_warnings(inode->i_dquot, warntype); up_read(&sb_dqopt(inode->i_sb)->dqptr_sem); return QUOTA_OK; @@ -1803,25 +1807,18 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) spin_unlock(&dq_data_lock); up_write(&sb_dqopt(inode->i_sb)->dqptr_sem); - /* Dirtify all the dquots - this can block when journalling */ - for (cnt = 0; cnt < MAXQUOTAS; cnt++) { - if (transfer_from[cnt]) - mark_dquot_dirty(transfer_from[cnt]); - if (transfer_to[cnt]) { - mark_dquot_dirty(transfer_to[cnt]); - /* The reference we got is transferred to the inode */ - transfer_to[cnt] = NULL; - } - } + mark_all_dquot_dirty(transfer_from); + mark_all_dquot_dirty(transfer_to); + /* The reference we got is transferred to the inode */ + for (cnt = 0; cnt < MAXQUOTAS; cnt++) + transfer_to[cnt] = NULL; warn_put_all: flush_warnings(transfer_to, warntype_to); flush_warnings(transfer_from, warntype_from_inodes); flush_warnings(transfer_from, warntype_from_space); put_all: - for (cnt = 0; cnt < MAXQUOTAS; cnt++) { - dqput(transfer_from[cnt]); - dqput(transfer_to[cnt]); - } + dqput_all(transfer_from); + dqput_all(transfer_to); return ret; over_quota: spin_unlock(&dq_data_lock); -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/5] quota: handle IO errors in dquot_transfer() 2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov @ 2009-12-14 12:21 ` Dmitry Monakhov 2009-12-14 18:41 ` Jan Kara 2009-12-14 18:00 ` [PATCH 4/5] quota: Move duplicated code to separate functions Jan Kara 1 sibling, 1 reply; 14+ messages in thread From: Dmitry Monakhov @ 2009-12-14 12:21 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel, Dmitry Monakhov transfer_to[cnt] = dqget() may returns NULL due to IO error. But NULL value in transfer_to[cnt] means a dquot transfer optimization. So after operation succeed inode will have new i_uid or i_gid but accounted to old dquot. This behaviour is differ from dquot_initialize(). Let's handle IO error from dqget() equally in all functions. In appliance to dquot_transfer() this means that we have to finish operation regardless to IO errors from dqget(). Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> --- fs/quota/dquot.c | 46 +++++++++++++++++++++++++++------------------- 1 files changed, 27 insertions(+), 19 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index d6d535c..fbbaa4e 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -1733,27 +1733,29 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) struct dquot *transfer_from[MAXQUOTAS]; struct dquot *transfer_to[MAXQUOTAS]; int cnt, ret = QUOTA_OK; - int chuid = iattr->ia_valid & ATTR_UID && inode->i_uid != iattr->ia_uid, - chgid = iattr->ia_valid & ATTR_GID && inode->i_gid != iattr->ia_gid; char warntype_to[MAXQUOTAS]; char warntype_from_inodes[MAXQUOTAS], warntype_from_space[MAXQUOTAS]; + char valid[MAXQUOTAS]; /* First test before acquiring mutex - solves deadlocks when we * re-enter the quota code and are already holding the mutex */ if (IS_NOQUOTA(inode)) return QUOTA_OK; /* Initialize the arrays */ + memset(transfer_to, 0, sizeof(transfer_to)); + memset(transfer_from, 0, sizeof(transfer_from)); + memset(valid, 0, sizeof(valid)); + valid[USRQUOTA] = iattr->ia_valid & ATTR_UID && + inode->i_uid != iattr->ia_uid, + valid[GRPQUOTA] = iattr->ia_valid & ATTR_GID && + inode->i_gid != iattr->ia_gid; + for (cnt = 0; cnt < MAXQUOTAS; cnt++) { - transfer_from[cnt] = NULL; - transfer_to[cnt] = NULL; warntype_to[cnt] = QUOTA_NL_NOWARN; + if (!valid[cnt]) + continue; + transfer_to[cnt] = dqget(inode->i_sb, iattr->ia_uid, cnt); } - if (chuid) - transfer_to[USRQUOTA] = dqget(inode->i_sb, iattr->ia_uid, - USRQUOTA); - if (chgid) - transfer_to[GRPQUOTA] = dqget(inode->i_sb, iattr->ia_gid, - GRPQUOTA); down_write(&sb_dqopt(inode->i_sb)->dqptr_sem); /* Now recheck reliably when holding dqptr_sem */ @@ -1767,9 +1769,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) space = cur_space + rsv_space; /* Build the transfer_from list and check the limits */ for (cnt = 0; cnt < MAXQUOTAS; cnt++) { - if (!transfer_to[cnt]) + if (!valid[cnt]) continue; transfer_from[cnt] = inode->i_dquot[cnt]; + if (!transfer_to[cnt]) + continue; if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) == NO_QUOTA || check_bdq(transfer_to[cnt], space, 0, warntype_to + cnt) == NO_QUOTA) @@ -1783,10 +1787,15 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) /* * Skip changes for same uid or gid or for turned off quota-type. */ - if (!transfer_to[cnt]) + if (!valid[cnt]) continue; + /* + * Due to IO error we might not have transfer_to[] or + * transfer_from[] structure. Nor than less the operation must + * being done regardless quota io errors. + */ + inode->i_dquot[cnt] = transfer_to[cnt]; - /* Due to IO error we might not have transfer_from[] structure */ if (transfer_from[cnt]) { warntype_from_inodes[cnt] = info_idq_free(transfer_from[cnt], 1); @@ -1797,12 +1806,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) dquot_free_reserved_space(transfer_from[cnt], rsv_space); } - - dquot_incr_inodes(transfer_to[cnt], 1); - dquot_incr_space(transfer_to[cnt], cur_space); - dquot_resv_space(transfer_to[cnt], rsv_space); - - inode->i_dquot[cnt] = transfer_to[cnt]; + if (transfer_to[cnt]) { + dquot_incr_inodes(transfer_to[cnt], 1); + dquot_incr_space(transfer_to[cnt], cur_space); + dquot_resv_space(transfer_to[cnt], rsv_space); + } } spin_unlock(&dq_data_lock); up_write(&sb_dqopt(inode->i_sb)->dqptr_sem); -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 5/5] quota: handle IO errors in dquot_transfer() 2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov @ 2009-12-14 18:41 ` Jan Kara 2009-12-14 22:43 ` Dmitry Monakhov 0 siblings, 1 reply; 14+ messages in thread From: Jan Kara @ 2009-12-14 18:41 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel On Mon 14-12-09 15:21:16, Dmitry Monakhov wrote: > transfer_to[cnt] = dqget() may returns NULL due to IO error. > But NULL value in transfer_to[cnt] means a dquot transfer > optimization. So after operation succeed inode will have new > i_uid or i_gid but accounted to old dquot. This behaviour > is differ from dquot_initialize(). Let's handle IO error from > dqget() equally in all functions. > > In appliance to dquot_transfer() this means that we have to finish > operation regardless to IO errors from dqget(). In principle, the patch is fine (see just about a bug below). But even better would be if you converted dquot_transfer() to actually return real return codes (0, -EDQUOT, -EIO...) and make it return EIO in case of IO error. Then a filesystem can properly fail chown if quota transfer cannot be performed. This is a larger project since filesystems using dquot_transfer need to be changed and also other dquot_... functions need to be converted to this calling convention for the sake of consistency. But it would be a good cleanup. ... > for (cnt = 0; cnt < MAXQUOTAS; cnt++) { > - transfer_from[cnt] = NULL; > - transfer_to[cnt] = NULL; > warntype_to[cnt] = QUOTA_NL_NOWARN; > + if (!valid[cnt]) > + continue; > + transfer_to[cnt] = dqget(inode->i_sb, iattr->ia_uid, cnt); This is wrong! You have to take ia_gid in case of GRPQUOTA. > @@ -1767,9 +1769,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) > space = cur_space + rsv_space; > /* Build the transfer_from list and check the limits */ > for (cnt = 0; cnt < MAXQUOTAS; cnt++) { > - if (!transfer_to[cnt]) > + if (!valid[cnt]) > continue; > transfer_from[cnt] = inode->i_dquot[cnt]; > + if (!transfer_to[cnt]) > + continue; > if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) == > NO_QUOTA || check_bdq(transfer_to[cnt], space, 0, > warntype_to + cnt) == NO_QUOTA) > @@ -1783,10 +1787,15 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) > /* > * Skip changes for same uid or gid or for turned off quota-type. > */ > - if (!transfer_to[cnt]) > + if (!valid[cnt]) > continue; > + /* > + * Due to IO error we might not have transfer_to[] or > + * transfer_from[] structure. Nor than less the operation must ^^ I suppose this should be "Nonetheless" > + * being done regardless quota io errors. > + */ > + inode->i_dquot[cnt] = transfer_to[cnt]; > > - /* Due to IO error we might not have transfer_from[] structure */ > if (transfer_from[cnt]) { > warntype_from_inodes[cnt] = > info_idq_free(transfer_from[cnt], 1); > @@ -1797,12 +1806,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) > dquot_free_reserved_space(transfer_from[cnt], > rsv_space); > } > - > - dquot_incr_inodes(transfer_to[cnt], 1); > - dquot_incr_space(transfer_to[cnt], cur_space); > - dquot_resv_space(transfer_to[cnt], rsv_space); > - > - inode->i_dquot[cnt] = transfer_to[cnt]; > + if (transfer_to[cnt]) { > + dquot_incr_inodes(transfer_to[cnt], 1); > + dquot_incr_space(transfer_to[cnt], cur_space); > + dquot_resv_space(transfer_to[cnt], rsv_space); > + } > } > spin_unlock(&dq_data_lock); > up_write(&sb_dqopt(inode->i_sb)->dqptr_sem); > -- > 1.6.0.4 > -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 5/5] quota: handle IO errors in dquot_transfer() 2009-12-14 18:41 ` Jan Kara @ 2009-12-14 22:43 ` Dmitry Monakhov 2009-12-15 11:49 ` Jan Kara 0 siblings, 1 reply; 14+ messages in thread From: Dmitry Monakhov @ 2009-12-14 22:43 UTC (permalink / raw) To: Jan Kara; +Cc: linux-fsdevel Jan Kara <jack@suse.cz> writes: > On Mon 14-12-09 15:21:16, Dmitry Monakhov wrote: >> transfer_to[cnt] = dqget() may returns NULL due to IO error. >> But NULL value in transfer_to[cnt] means a dquot transfer >> optimization. So after operation succeed inode will have new >> i_uid or i_gid but accounted to old dquot. This behaviour >> is differ from dquot_initialize(). Let's handle IO error from >> dqget() equally in all functions. >> >> In appliance to dquot_transfer() this means that we have to finish >> operation regardless to IO errors from dqget(). > In principle, the patch is fine (see just about a bug below). But even > better would be if you converted dquot_transfer() to actually return real > return codes (0, -EDQUOT, -EIO...) and make it return EIO in case of IO Actually we have following set of errors (0, -EDQUOT, -EIO, -ENOMEM, -ENOSPC) And ENOSPC is more realistic than EIO or ENOMEM. But from other point of view quota is some sort of fs meta-data, so we can not let it just *silently* fail because of some error as it done at the moment. > error. Then a filesystem can properly fail chown if quota transfer cannot > be performed. This is a larger project since filesystems using > dquot_transfer need to be changed and also other dquot_... functions need > to be converted to this calling convention for the sake of consistency. But > it would be a good cleanup. I'll handle it. > > ... >> for (cnt = 0; cnt < MAXQUOTAS; cnt++) { >> - transfer_from[cnt] = NULL; >> - transfer_to[cnt] = NULL; >> warntype_to[cnt] = QUOTA_NL_NOWARN; >> + if (!valid[cnt]) >> + continue; >> + transfer_to[cnt] = dqget(inode->i_sb, iattr->ia_uid, cnt); > This is wrong! You have to take ia_gid in case of GRPQUOTA. > >> @@ -1767,9 +1769,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) >> space = cur_space + rsv_space; >> /* Build the transfer_from list and check the limits */ >> for (cnt = 0; cnt < MAXQUOTAS; cnt++) { >> - if (!transfer_to[cnt]) >> + if (!valid[cnt]) >> continue; >> transfer_from[cnt] = inode->i_dquot[cnt]; >> + if (!transfer_to[cnt]) >> + continue; >> if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) == >> NO_QUOTA || check_bdq(transfer_to[cnt], space, 0, >> warntype_to + cnt) == NO_QUOTA) >> @@ -1783,10 +1787,15 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) >> /* >> * Skip changes for same uid or gid or for turned off quota-type. >> */ >> - if (!transfer_to[cnt]) >> + if (!valid[cnt]) >> continue; >> + /* >> + * Due to IO error we might not have transfer_to[] or >> + * transfer_from[] structure. Nor than less the operation must > ^^ I suppose this should be > "Nonetheless" >> + * being done regardless quota io errors. >> + */ >> + inode->i_dquot[cnt] = transfer_to[cnt]; >> >> - /* Due to IO error we might not have transfer_from[] structure */ >> if (transfer_from[cnt]) { >> warntype_from_inodes[cnt] = >> info_idq_free(transfer_from[cnt], 1); >> @@ -1797,12 +1806,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr) >> dquot_free_reserved_space(transfer_from[cnt], >> rsv_space); >> } >> - >> - dquot_incr_inodes(transfer_to[cnt], 1); >> - dquot_incr_space(transfer_to[cnt], cur_space); >> - dquot_resv_space(transfer_to[cnt], rsv_space); >> - >> - inode->i_dquot[cnt] = transfer_to[cnt]; >> + if (transfer_to[cnt]) { >> + dquot_incr_inodes(transfer_to[cnt], 1); >> + dquot_incr_space(transfer_to[cnt], cur_space); >> + dquot_resv_space(transfer_to[cnt], rsv_space); >> + } >> } >> spin_unlock(&dq_data_lock); >> up_write(&sb_dqopt(inode->i_sb)->dqptr_sem); >> -- >> 1.6.0.4 >> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 5/5] quota: handle IO errors in dquot_transfer() 2009-12-14 22:43 ` Dmitry Monakhov @ 2009-12-15 11:49 ` Jan Kara 0 siblings, 0 replies; 14+ messages in thread From: Jan Kara @ 2009-12-15 11:49 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel On Tue 15-12-09 01:43:18, Dmitry Monakhov wrote: > Jan Kara <jack@suse.cz> writes: > > > On Mon 14-12-09 15:21:16, Dmitry Monakhov wrote: > >> transfer_to[cnt] = dqget() may returns NULL due to IO error. > >> But NULL value in transfer_to[cnt] means a dquot transfer > >> optimization. So after operation succeed inode will have new > >> i_uid or i_gid but accounted to old dquot. This behaviour > >> is differ from dquot_initialize(). Let's handle IO error from > >> dqget() equally in all functions. > >> > >> In appliance to dquot_transfer() this means that we have to finish > >> operation regardless to IO errors from dqget(). > > In principle, the patch is fine (see just about a bug below). But even > > better would be if you converted dquot_transfer() to actually return real > > return codes (0, -EDQUOT, -EIO...) and make it return EIO in case of IO > Actually we have following set of errors (0, -EDQUOT, -EIO, -ENOMEM, -ENOSPC) > And ENOSPC is more realistic than EIO or ENOMEM. But from other point of view > quota is some sort of fs meta-data, so we can not let it just *silently* fail > because of some error as it done at the moment. Exactly. That's why I'd like to get this cleaned up in the long run... Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/5] quota: Move duplicated code to separate functions 2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov @ 2009-12-14 18:00 ` Jan Kara 1 sibling, 0 replies; 14+ messages in thread From: Jan Kara @ 2009-12-14 18:00 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel On Mon 14-12-09 15:21:15, Dmitry Monakhov wrote: > - for(..) { mark_dquot_dirty(); } -> mark_all_dquot_dirty() > - for(..) { dput_all(); } -> dqput_all() > > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> > --- > fs/quota/dquot.c | 75 ++++++++++++++++++++++++++---------------------------- > 1 files changed, 36 insertions(+), 39 deletions(-) > > diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c > index 942ea0b..d6d535c 100644 > --- a/fs/quota/dquot.c > +++ b/fs/quota/dquot.c > @@ -327,6 +327,28 @@ int dquot_mark_dquot_dirty(struct dquot *dquot) > } > EXPORT_SYMBOL(dquot_mark_dquot_dirty); > > +/* Dirtify all the dquots - this can block when journalling */ > +static inline int mark_all_dquot_dirty(struct dquot **dquot) > +{ > + int ret, err, cnt; Added empty line here. > + ret = err = 0; > + for (cnt = 0; cnt < MAXQUOTAS; cnt++) { > + if (dquot[cnt]) > + /* Even in case of error we have to continue */ > + ret = mark_dquot_dirty(dquot[cnt]); > + if (!err) > + err = ret; > + } > + return err; > +} > + > +static inline void dqput_all(struct dquot **dquot) > +{ > + unsigned int cnt; Added empty line here. > + for (cnt = 0; cnt < MAXQUOTAS; cnt++) > + dqput(dquot[cnt]); > +} ... > @@ -1569,10 +1585,7 @@ int dquot_alloc_inode(const struct inode *inode, qsize_t number) > warn_put_all: > spin_unlock(&dq_data_lock); > if (ret == QUOTA_OK) > - /* Dirtify all the dquots - this can block when journalling */ > - for (cnt = 0; cnt < MAXQUOTAS; cnt++) > - if (inode->i_dquot[cnt]) > - mark_dquot_dirty(inode->i_dquot[cnt]); > + mark_all_dquot_dirty((struct dquot **)inode->i_dquot); Removed the typecast here. > @@ -1706,10 +1713,7 @@ int dquot_free_inode(const struct inode *inode, qsize_t number) > dquot_decr_inodes(inode->i_dquot[cnt], number); > } > spin_unlock(&dq_data_lock); > - /* Dirtify all the dquots - this can block when journalling */ > - for (cnt = 0; cnt < MAXQUOTAS; cnt++) > - if (inode->i_dquot[cnt]) > - mark_dquot_dirty(inode->i_dquot[cnt]); > + mark_all_dquot_dirty((struct dquot **)inode->i_dquot); Removed the typecast here. ... Merged the patch into my tree. Thanks. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] 2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov @ 2009-12-14 14:35 ` tytso 2009-12-14 17:26 ` Jan Kara 2009-12-14 17:25 ` Jan Kara 2 siblings, 1 reply; 14+ messages in thread From: tytso @ 2009-12-14 14:35 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel, linux-ext4 On Mon, Dec 14, 2009 at 03:21:14PM +0300, Dmitry Monakhov wrote: > This patch fix write vs chown race condition. > > Changes from V4 > - Coding style cleanup according to Jan's comment. > Changes from V2: > - add missed i_reserved_quota iniatilization. > > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Jan, I assume you'll push this along with the other quota-related changes out from your tree? We have some ext4-related quota bugs that I think may be related to this fix (and a bunch of distros are planning on using 2.6.32), so if you could review this and get pushed this patch series pushed out for 2.6.32 it would be appreciated, thanks. - Ted ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] 2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso @ 2009-12-14 17:26 ` Jan Kara 0 siblings, 0 replies; 14+ messages in thread From: Jan Kara @ 2009-12-14 17:26 UTC (permalink / raw) To: tytso; +Cc: Dmitry Monakhov, Jan Kara, linux-fsdevel, linux-ext4 On Mon 14-12-09 09:35:51, tytso@mit.edu wrote: > On Mon, Dec 14, 2009 at 03:21:14PM +0300, Dmitry Monakhov wrote: > > This patch fix write vs chown race condition. > > > > Changes from V4 > > - Coding style cleanup according to Jan's comment. > > Changes from V2: > > - add missed i_reserved_quota iniatilization. > > > > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> > > Acked-by: "Theodore Ts'o" <tytso@mit.edu> > > Jan, I assume you'll push this along with the other quota-related > changes out from your tree? We have some ext4-related quota bugs that > I think may be related to this fix (and a bunch of distros are > planning on using 2.6.32), so if you could review this and get pushed > this patch series pushed out for 2.6.32 it would be appreciated, > thanks. Yes, I'll push it via my tree and take care of pushing into 2.6.32 stable tree as well. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] 2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov 2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso @ 2009-12-14 17:25 ` Jan Kara 2 siblings, 0 replies; 14+ messages in thread From: Jan Kara @ 2009-12-14 17:25 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel, linux-ext4 On Mon 14-12-09 15:21:14, Dmitry Monakhov wrote: > This patch fix write vs chown race condition. > > Changes from V4 > - Coding style cleanup according to Jan's comment. > Changes from V2: > - add missed i_reserved_quota iniatilization. Looks good, merged into my tree. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] 2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov @ 2009-12-14 17:24 ` Jan Kara 1 sibling, 0 replies; 14+ messages in thread From: Jan Kara @ 2009-12-14 17:24 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel, linux-ext4 On Mon 14-12-09 15:21:13, Dmitry Monakhov wrote: > Currently inode_reservation is managed by fs itself and this > reservation is transfered on dquot_transfer(). This means what > inode_reservation must always be in sync with > dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result > in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be > triggered) > This is not easy because of complex locking order issues > for example http://bugzilla.kernel.org/show_bug.cgi?id=14739 > > The patch introduce quota reservation field for each fs-inode > (fs specific inode is used in order to prevent bloating generic > vfs inode). This reservation is managed by quota code internally > similar to i_blocks/i_bytes and may not be always in sync with > internal fs reservation. > > Also perform some code rearrangement: > - Unify dquot_reserve_space() and dquot_reserve_space() > - Unify dquot_release_reserved_space() and dquot_free_space() > - Also this patch add missing warning update to release_rsv() > dquot_release_reserved_space() must call flush_warnings() as > dquot_free_space() does. > > Changes from V4 > - fixes and cleanups according to Jan's comments. > Changes from V3: > - fix deadlock in dquota_alloc_space in journalled mode with nodelalloc > Changes from V1: > - move qutoa_reservation field from vfs_inode to fs_inode > - account reservation in bytes instead of blocks. Looks good. I've just slightly updated changelog and below typecast (doing just __dquot_free_space(inode, number, 1)) and merged the patch to my tree. > +void dquot_release_reserved_space(struct inode *inode, qsize_t number) > +{ > + return (void )__dquot_free_space(inode, number, 1); > + > +} Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/5] Add unlocked version of inode_add_bytes() function 2009-12-14 12:21 [PATCH 1/5] Add unlocked version of inode_add_bytes() function Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov @ 2009-12-14 17:21 ` Jan Kara 1 sibling, 0 replies; 14+ messages in thread From: Jan Kara @ 2009-12-14 17:21 UTC (permalink / raw) To: Dmitry Monakhov; +Cc: Jan Kara, linux-fsdevel On Mon 14-12-09 15:21:12, Dmitry Monakhov wrote: > Quota code requires unlocked version of this function. Off course > we can just copy-paste the code, but copy-pasting is always an evil. Looks good. Merged into my tree. Honza > Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> > --- > fs/stat.c | 10 ++++++++-- > include/linux/fs.h | 1 + > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/fs/stat.c b/fs/stat.c > index 075694e..c4ecd52 100644 > --- a/fs/stat.c > +++ b/fs/stat.c > @@ -401,9 +401,9 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, char __user *, filename, > } > #endif /* __ARCH_WANT_STAT64 */ > > -void inode_add_bytes(struct inode *inode, loff_t bytes) > +/* Caller is here responsible for sufficient locking (ie. inode->i_lock) */ > +void __inode_add_bytes(struct inode *inode, loff_t bytes) > { > - spin_lock(&inode->i_lock); > inode->i_blocks += bytes >> 9; > bytes &= 511; > inode->i_bytes += bytes; > @@ -411,6 +411,12 @@ void inode_add_bytes(struct inode *inode, loff_t bytes) > inode->i_blocks++; > inode->i_bytes -= 512; > } > +} > + > +void inode_add_bytes(struct inode *inode, loff_t bytes) > +{ > + spin_lock(&inode->i_lock); > + __inode_add_bytes(inode, bytes); > spin_unlock(&inode->i_lock); > } > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 2620a8c..98ea200 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -2314,6 +2314,7 @@ extern const struct inode_operations page_symlink_inode_operations; > extern int generic_readlink(struct dentry *, char __user *, int); > extern void generic_fillattr(struct inode *, struct kstat *); > extern int vfs_getattr(struct vfsmount *, struct dentry *, struct kstat *); > +void __inode_add_bytes(struct inode *inode, loff_t bytes); > void inode_add_bytes(struct inode *inode, loff_t bytes); > void inode_sub_bytes(struct inode *inode, loff_t bytes); > loff_t inode_get_bytes(struct inode *inode); > -- > 1.6.0.4 > -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-12-15 11:49 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-12-14 12:21 [PATCH 1/5] Add unlocked version of inode_add_bytes() function Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 4/5] quota: Move duplicated code to separate functions Dmitry Monakhov 2009-12-14 12:21 ` [PATCH 5/5] quota: handle IO errors in dquot_transfer() Dmitry Monakhov 2009-12-14 18:41 ` Jan Kara 2009-12-14 22:43 ` Dmitry Monakhov 2009-12-15 11:49 ` Jan Kara 2009-12-14 18:00 ` [PATCH 4/5] quota: Move duplicated code to separate functions Jan Kara 2009-12-14 14:35 ` [PATCH 3/5] ext4: Convert to generic reserved quota's space management. [V5] tytso 2009-12-14 17:26 ` Jan Kara 2009-12-14 17:25 ` Jan Kara 2009-12-14 17:24 ` [PATCH 2/5] quota: decouple fs reserved space from quota reservation. [V5] Jan Kara 2009-12-14 17:21 ` [PATCH 1/5] Add unlocked version of inode_add_bytes() function Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).