* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-02 14:23 [PATCH] ext4: Always journal quota file modifications Jan Kara
@ 2010-06-02 21:39 ` Eric Sandeen
2010-06-02 23:46 ` Jan Kara
2010-06-03 9:07 ` Dmitry Monakhov
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Eric Sandeen @ 2010-06-02 21:39 UTC (permalink / raw)
To: Jan Kara; +Cc: tytso, linux-ext4
Jan Kara wrote:
> When journaled quota options are not specified, we do writes
> to quota files just in data=ordered mode. This actually causes
> warnings from JBD2 about dirty journaled buffer because ext4_getblk
> unconditionally treats a block allocated by it as metadata. Since
> quota actually is filesystem metadata, the easiest way to get rid
> of the warning is to always treat quota writes as metadata...
>
> Signed-off-by: Jan Kara <jack@suse.cz>
Seems like a good approach to me, thanks Jan.
(aside: how big an impact does this have on journal traffic for a busy
quota-controlled fs?)
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
> ---
> fs/ext4/super.c | 19 +++++--------------
> 1 files changed, 5 insertions(+), 14 deletions(-)
>
> Ted, this patch fixes some JBD2 warning for me when running XFSQA
> with quotas enabled. I think this is a move into a direction you are
> trying to achieve as well. Will you merge the patch or should I do it?
>
> Honza
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4e8983a..a8cea08 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4030,7 +4030,6 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> ext4_lblk_t blk = off >> EXT4_BLOCK_SIZE_BITS(sb);
> int err = 0;
> int offset = off & (sb->s_blocksize - 1);
> - int journal_quota = EXT4_SB(sb)->s_qf_names[type] != NULL;
> struct buffer_head *bh;
> handle_t *handle = journal_current_handle();
>
> @@ -4055,24 +4054,16 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> bh = ext4_bread(handle, inode, blk, 1, &err);
> if (!bh)
> goto out;
> - if (journal_quota) {
> - err = ext4_journal_get_write_access(handle, bh);
> - if (err) {
> - brelse(bh);
> - goto out;
> - }
> + err = ext4_journal_get_write_access(handle, bh);
> + if (err) {
> + brelse(bh);
> + goto out;
> }
> lock_buffer(bh);
> memcpy(bh->b_data+offset, data, len);
> flush_dcache_page(bh->b_page);
> unlock_buffer(bh);
> - if (journal_quota)
> - err = ext4_handle_dirty_metadata(handle, NULL, bh);
> - else {
> - /* Always do at least ordered writes for quotas */
> - err = ext4_jbd2_file_inode(handle, inode);
> - mark_buffer_dirty(bh);
> - }
> + err = ext4_handle_dirty_metadata(handle, NULL, bh);
> brelse(bh);
> out:
> if (err) {
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-02 21:39 ` Eric Sandeen
@ 2010-06-02 23:46 ` Jan Kara
0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2010-06-02 23:46 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Jan Kara, tytso, linux-ext4
On Wed 02-06-10 16:39:59, Eric Sandeen wrote:
> Jan Kara wrote:
> > When journaled quota options are not specified, we do writes
> > to quota files just in data=ordered mode. This actually causes
> > warnings from JBD2 about dirty journaled buffer because ext4_getblk
> > unconditionally treats a block allocated by it as metadata. Since
> > quota actually is filesystem metadata, the easiest way to get rid
> > of the warning is to always treat quota writes as metadata...
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
>
> Seems like a good approach to me, thanks Jan.
>
> (aside: how big an impact does this have on journal traffic for a busy
> quota-controlled fs?)
With quota a bit tricky thing is that amount of load depends on the
amount of different groups / users doing IO (and to some extent on how
much these users / groups share blocks of quota file). One can make a
simplistic upper estimate: 1 journalled block per user / group who modified
something in a transaction...
Just out of curiosity I have created 10000 files in 10 directories, each
file belonging to a different user. Then I run:
sync; time { for (( i = 0; i < 10000; i++ )); do echo "aaaaaaaa"
>dir$((i/1000))/file$((i%1000)); done; sync; }
and
sync; time { for (( i = 0; i < 10000; i++ )); do
>dir$((i/1000))/file$((i%1000)); done; sync; }
With journaled quotas enabled the times were pretty consistently around
2.9s for writes and 2.1s for truncates, without quotas the times were
around 2.7s for writes and 1.7s for truncates. So this gives some rough
idea how much quota costs for this rather extreme use case.
Another case where quota impact is measurable is when several threads
bang files of the same user. Then we hit lock contention on the single dquot
structure, most notably on journaling it whenever it gets modified. Dmitry
was looking into addressing this...
Honza
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
>
> > ---
> > fs/ext4/super.c | 19 +++++--------------
> > 1 files changed, 5 insertions(+), 14 deletions(-)
> >
> > Ted, this patch fixes some JBD2 warning for me when running XFSQA
> > with quotas enabled. I think this is a move into a direction you are
> > trying to achieve as well. Will you merge the patch or should I do it?
> >
> > Honza
> >
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 4e8983a..a8cea08 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -4030,7 +4030,6 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> > ext4_lblk_t blk = off >> EXT4_BLOCK_SIZE_BITS(sb);
> > int err = 0;
> > int offset = off & (sb->s_blocksize - 1);
> > - int journal_quota = EXT4_SB(sb)->s_qf_names[type] != NULL;
> > struct buffer_head *bh;
> > handle_t *handle = journal_current_handle();
> >
> > @@ -4055,24 +4054,16 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> > bh = ext4_bread(handle, inode, blk, 1, &err);
> > if (!bh)
> > goto out;
> > - if (journal_quota) {
> > - err = ext4_journal_get_write_access(handle, bh);
> > - if (err) {
> > - brelse(bh);
> > - goto out;
> > - }
> > + err = ext4_journal_get_write_access(handle, bh);
> > + if (err) {
> > + brelse(bh);
> > + goto out;
> > }
> > lock_buffer(bh);
> > memcpy(bh->b_data+offset, data, len);
> > flush_dcache_page(bh->b_page);
> > unlock_buffer(bh);
> > - if (journal_quota)
> > - err = ext4_handle_dirty_metadata(handle, NULL, bh);
> > - else {
> > - /* Always do at least ordered writes for quotas */
> > - err = ext4_jbd2_file_inode(handle, inode);
> > - mark_buffer_dirty(bh);
> > - }
> > + err = ext4_handle_dirty_metadata(handle, NULL, bh);
> > brelse(bh);
> > out:
> > if (err) {
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-02 14:23 [PATCH] ext4: Always journal quota file modifications Jan Kara
2010-06-02 21:39 ` Eric Sandeen
@ 2010-06-03 9:07 ` Dmitry Monakhov
2010-06-03 11:54 ` Jan Kara
2010-06-03 12:53 ` tytso
2010-07-05 20:08 ` Eric Sandeen
3 siblings, 1 reply; 13+ messages in thread
From: Dmitry Monakhov @ 2010-06-03 9:07 UTC (permalink / raw)
To: Jan Kara; +Cc: tytso, linux-ext4
Jan Kara <jack@suse.cz> writes:
> When journaled quota options are not specified, we do writes
> to quota files just in data=ordered mode. This actually causes
> warnings from JBD2 about dirty journaled buffer because ext4_getblk
> unconditionally treats a block allocated by it as metadata. Since
> quota actually is filesystem metadata, the easiest way to get rid
> of the warning is to always treat quota writes as metadata...
Absolutely agree with the fix, but I have a theoretical question.
Is is possible to solve the issue without handling quota's bh
via ext4_handle_dirty_metadata()?
As soon as i understand ext4_jbd2_file_inode() not works here
because bh is belongs to blkdev page-cache. In other words is it
possible provide ordering for blkdev's blocks in jbd2?
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ext4/super.c | 19 +++++--------------
> 1 files changed, 5 insertions(+), 14 deletions(-)
>
> Ted, this patch fixes some JBD2 warning for me when running XFSQA
> with quotas enabled. I think this is a move into a direction you are
> trying to achieve as well. Will you merge the patch or should I do it?
>
> Honza
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4e8983a..a8cea08 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4030,7 +4030,6 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> ext4_lblk_t blk = off >> EXT4_BLOCK_SIZE_BITS(sb);
> int err = 0;
> int offset = off & (sb->s_blocksize - 1);
> - int journal_quota = EXT4_SB(sb)->s_qf_names[type] != NULL;
> struct buffer_head *bh;
> handle_t *handle = journal_current_handle();
>
> @@ -4055,24 +4054,16 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> bh = ext4_bread(handle, inode, blk, 1, &err);
> if (!bh)
> goto out;
> - if (journal_quota) {
> - err = ext4_journal_get_write_access(handle, bh);
> - if (err) {
> - brelse(bh);
> - goto out;
> - }
> + err = ext4_journal_get_write_access(handle, bh);
> + if (err) {
> + brelse(bh);
> + goto out;
> }
> lock_buffer(bh);
> memcpy(bh->b_data+offset, data, len);
> flush_dcache_page(bh->b_page);
> unlock_buffer(bh);
> - if (journal_quota)
> - err = ext4_handle_dirty_metadata(handle, NULL, bh);
> - else {
> - /* Always do at least ordered writes for quotas */
> - err = ext4_jbd2_file_inode(handle, inode);
> - mark_buffer_dirty(bh);
> - }
> + err = ext4_handle_dirty_metadata(handle, NULL, bh);
> brelse(bh);
> out:
> if (err) {
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-03 9:07 ` Dmitry Monakhov
@ 2010-06-03 11:54 ` Jan Kara
0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2010-06-03 11:54 UTC (permalink / raw)
To: Dmitry Monakhov; +Cc: Jan Kara, tytso, linux-ext4
On Thu 03-06-10 13:07:41, Dmitry Monakhov wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > When journaled quota options are not specified, we do writes
> > to quota files just in data=ordered mode. This actually causes
> > warnings from JBD2 about dirty journaled buffer because ext4_getblk
> > unconditionally treats a block allocated by it as metadata. Since
> > quota actually is filesystem metadata, the easiest way to get rid
> > of the warning is to always treat quota writes as metadata...
> Absolutely agree with the fix, but I have a theoretical question.
> Is is possible to solve the issue without handling quota's bh
> via ext4_handle_dirty_metadata()?
> As soon as i understand ext4_jbd2_file_inode() not works here
> because bh is belongs to blkdev page-cache. In other words is it
> possible provide ordering for blkdev's blocks in jbd2?
You are right. It's not possible to provide ordering for blkdev's blocks
in JBD2 (unlike JBD). So a different solution would be rather problematic.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-02 14:23 [PATCH] ext4: Always journal quota file modifications Jan Kara
2010-06-02 21:39 ` Eric Sandeen
2010-06-03 9:07 ` Dmitry Monakhov
@ 2010-06-03 12:53 ` tytso
2010-06-03 14:13 ` Dmitry Monakhov
` (2 more replies)
2010-07-05 20:08 ` Eric Sandeen
3 siblings, 3 replies; 13+ messages in thread
From: tytso @ 2010-06-03 12:53 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4
On Wed, Jun 02, 2010 at 04:23:13PM +0200, Jan Kara wrote:
> When journaled quota options are not specified, we do writes
> to quota files just in data=ordered mode. This actually causes
> warnings from JBD2 about dirty journaled buffer because ext4_getblk
> unconditionally treats a block allocated by it as metadata. Since
> quota actually is filesystem metadata, the easiest way to get rid
> of the warning is to always treat quota writes as metadata...
>
> Signed-off-by: Jan Kara <jack@suse.cz>
I'm worried about this patch in the short-term. In the long-term I
think the quota file should become a special file much like the
journal, and then this makes a huge amount of sense. But I worry
about what might happen if (a) someone tries writing to the quota file
directly from userspace, maybe right before quota is enabled (and
before delayed allocation writes complete, such that some writes are
happening via the journal in ext4_quota_write and some w/o going
through the journal in ext4_writepage), and (b) what happens if quota
is disabled, the quota file is deleted, and some blocks get reused ---
and then system crashes before a journal commit can happen.
All of these problems go away if the quota file isn't visible from
userspace, and it becomes a special file. In the short term I think
we could make this change, but I think we would also have to (1) treat
the quota file as immutable while quotas are enabled (so it cannot be
opened for writing), (2) force an fsync of the quota file and a
journal commit before enabling quotas, and (3) force a journal commit
after disabling quotas.
The other thing we might try that might mostly fix things is to change
ext4_should_journal_data() in ext4_jbd2.h to return true if it's a
quota file --- but we don't know which files are the quota files when
quotas are disabled, so we would still need to do (2) and (3). But
this would allow us to write to the quota file while quotas are
enabled, if we think this is necessary --- although I think it's a bad
idea, so I'd be in favor of simply not allowing quota files to be
writable from userspace while quotas are enabled. Jan, is this going
to cause any problems with quotautils?
OTOH, I think we have similar races with journaled quotas, and no one
has complained (although the vast majority of the quota documentation
on various HOWTO pages still don't talk about journaled quotas, so I
don't know how many people are using journaled quotas. :-/ )
> Ted, this patch fixes some JBD2 warning for me when running XFSQA
> with quotas enabled. I think this is a move into a direction you are
> trying to achieve as well. Will you merge the patch or should I do it?
I'm happy to carry the patch, since I Have Plans to try to make quotas
be a first class supported filesystem feature (i.e., make the quota
file a special file, and make quota files be always journaled if they
are journaled, and make the !@#! magic quota options handling in
/proc/mounts go away) in the 2.6.36 timeframe.
So the question is should we try to merge something like this for
2.6.35 or 2.6.35.y, and if so, how much bullet-proofing do we feel is
necessary for some of these races that I've outlined above.
- Ted
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-03 12:53 ` tytso
@ 2010-06-03 14:13 ` Dmitry Monakhov
2010-06-03 14:19 ` Jan Kara
2010-06-03 16:47 ` Bernd Schubert
2 siblings, 0 replies; 13+ messages in thread
From: Dmitry Monakhov @ 2010-06-03 14:13 UTC (permalink / raw)
To: tytso; +Cc: Jan Kara, linux-ext4
tytso@mit.edu writes:
> On Wed, Jun 02, 2010 at 04:23:13PM +0200, Jan Kara wrote:
>> When journaled quota options are not specified, we do writes
>> to quota files just in data=ordered mode. This actually causes
>> warnings from JBD2 about dirty journaled buffer because ext4_getblk
>> unconditionally treats a block allocated by it as metadata. Since
>> quota actually is filesystem metadata, the easiest way to get rid
>> of the warning is to always treat quota writes as metadata...
>>
>> Signed-off-by: Jan Kara <jack@suse.cz>
>
> I'm worried about this patch in the short-term. In the long-term I
> think the quota file should become a special file much like the
> journal, and then this makes a huge amount of sense. But I worry
> about what might happen if (a) someone tries writing to the quota file
> directly from userspace, maybe right before quota is enabled (and
> before delayed allocation writes complete, such that some writes are
> happening via the journal in ext4_quota_write and some w/o going
> through the journal in ext4_writepage), and (b) what happens if quota
> is disabled, the quota file is deleted, and some blocks get reused ---
> and then system crashes before a journal commit can happen.
>
> All of these problems go away if the quota file isn't visible from
> userspace, and it becomes a special file. In the short term I think
> we could make this change, but I think we would also have to (1) treat
> the quota file as immutable while quotas are enabled (so it cannot be
> opened for writing), (2) force an fsync of the quota file and a
> journal commit before enabling quotas, and (3) force a journal commit
> after disabling quotas.
3'rd is already true
int vfs_quota_disable(...)
{...
/* Sync the superblock so that buffers with quota data are written to
* disk (and so userspace sees correct data afterwards). */
if (sb->s_op->sync_fs)
sb->s_op->sync_fs(sb, 1);
sync_blockdev(sb->s_bdev);
/* Now the quota files are just ordinary files and we can set the
* inode flags back. Moreover we discard the pagecache so that
...}
>
> The other thing we might try that might mostly fix things is to change
> ext4_should_journal_data() in ext4_jbd2.h to return true if it's a
> quota file --- but we don't know which files are the quota files when
> quotas are disabled, so we would still need to do (2) and (3). But
> this would allow us to write to the quota file while quotas are
> enabled, if we think this is necessary --- although I think it's a bad
> idea, so I'd be in favor of simply not allowing quota files to be
> writable from userspace while quotas are enabled. Jan, is this going
> to cause any problems with quotautils?
>
> OTOH, I think we have similar races with journaled quotas, and no one
> has complained (although the vast majority of the quota documentation
> on various HOWTO pages still don't talk about journaled quotas, so I
> don't know how many people are using journaled quotas. :-/ )
>
>> Ted, this patch fixes some JBD2 warning for me when running XFSQA
>> with quotas enabled. I think this is a move into a direction you are
>> trying to achieve as well. Will you merge the patch or should I do it?
>
> I'm happy to carry the patch, since I Have Plans to try to make quotas
> be a first class supported filesystem feature (i.e., make the quota
> file a special file, and make quota files be always journaled if they
> are journaled, and make the !@#! magic quota options handling in
> /proc/mounts go away) in the 2.6.36 timeframe.
>
> So the question is should we try to merge something like this for
> 2.6.35 or 2.6.35.y, and if so, how much bullet-proofing do we feel is
> necessary for some of these races that I've outlined above.
>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-03 12:53 ` tytso
2010-06-03 14:13 ` Dmitry Monakhov
@ 2010-06-03 14:19 ` Jan Kara
2010-06-03 17:10 ` tytso
2010-06-03 16:47 ` Bernd Schubert
2 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2010-06-03 14:19 UTC (permalink / raw)
To: tytso; +Cc: Jan Kara, linux-ext4
On Thu 03-06-10 08:53:12, tytso@mit.edu wrote:
> On Wed, Jun 02, 2010 at 04:23:13PM +0200, Jan Kara wrote:
> > When journaled quota options are not specified, we do writes
> > to quota files just in data=ordered mode. This actually causes
> > warnings from JBD2 about dirty journaled buffer because ext4_getblk
> > unconditionally treats a block allocated by it as metadata. Since
> > quota actually is filesystem metadata, the easiest way to get rid
> > of the warning is to always treat quota writes as metadata...
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
>
> I'm worried about this patch in the short-term. In the long-term I
> think the quota file should become a special file much like the
> journal, and then this makes a huge amount of sense. But I worry
> about what might happen if (a) someone tries writing to the quota file
> directly from userspace, maybe right before quota is enabled (and
> before delayed allocation writes complete, such that some writes are
> happening via the journal in ext4_quota_write and some w/o going
> through the journal in ext4_writepage), and (b) what happens if quota
> is disabled, the quota file is deleted, and some blocks get reused ---
> and then system crashes before a journal commit can happen.
>
> All of these problems go away if the quota file isn't visible from
> userspace, and it becomes a special file. In the short term I think
> we could make this change, but I think we would also have to (1) treat
> the quota file as immutable while quotas are enabled (so it cannot be
> opened for writing), (2) force an fsync of the quota file and a
> journal commit before enabling quotas, and (3) force a journal commit
> after disabling quotas.
Ted, that's what generic quota code actually does for you (unless
DQUOT_QUOTA_SYS_FILE flag is specified but that's not the case of ext?)
- see vfs_load_quota_inode. We do:
sync_filesystem(sb);
invalidate_bdev(sb->s_bdev);
..
inode->i_flags |= S_NOQUOTA | S_NOATIME | S_IMMUTABLE;
..
So unless someone tries to screw us really hard, we should be fine.
> The other thing we might try that might mostly fix things is to change
> ext4_should_journal_data() in ext4_jbd2.h to return true if it's a
> quota file --- but we don't know which files are the quota files when
> quotas are disabled, so we would still need to do (2) and (3). But
> this would allow us to write to the quota file while quotas are
> enabled, if we think this is necessary --- although I think it's a bad
> idea, so I'd be in favor of simply not allowing quota files to be
> writable from userspace while quotas are enabled. Jan, is this going
> to cause any problems with quotautils?
quotautils do not write to quota files when quota is turned on (only when
it is turned off). They can only read from the files while quota is on
(repquota, warnquota).
> OTOH, I think we have similar races with journaled quotas, and no one
> has complained (although the vast majority of the quota documentation
> on various HOWTO pages still don't talk about journaled quotas, so I
> don't know how many people are using journaled quotas. :-/ )
Well, at least in SUSE if you enable in Yast2 that you want quotas on the
filesystem, it will use journaled quotas for a few releases already... So
I think it is used at least by some users.
> > Ted, this patch fixes some JBD2 warning for me when running XFSQA
> > with quotas enabled. I think this is a move into a direction you are
> > trying to achieve as well. Will you merge the patch or should I do it?
>
> I'm happy to carry the patch, since I Have Plans to try to make quotas
> be a first class supported filesystem feature (i.e., make the quota
> file a special file, and make quota files be always journaled if they
> are journaled, and make the !@#! magic quota options handling in
> /proc/mounts go away) in the 2.6.36 timeframe.
>
> So the question is should we try to merge something like this for
> 2.6.35 or 2.6.35.y, and if so, how much bullet-proofing do we feel is
> necessary for some of these races that I've outlined above.
Yes, I think we can push the patch to 2.6.35 and maybe to 2.6.34.y.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-03 14:19 ` Jan Kara
@ 2010-06-03 17:10 ` tytso
2010-06-04 14:45 ` Jan Kara
0 siblings, 1 reply; 13+ messages in thread
From: tytso @ 2010-06-03 17:10 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-ext4
On Thu, Jun 03, 2010 at 04:19:48PM +0200, Jan Kara wrote:
> > All of these problems go away if the quota file isn't visible from
> > userspace, and it becomes a special file. In the short term I think
> > we could make this change, but I think we would also have to (1) treat
> > the quota file as immutable while quotas are enabled (so it cannot be
> > opened for writing), (2) force an fsync of the quota file and a
> > journal commit before enabling quotas, and (3) force a journal commit
> > after disabling quotas.
> Ted, that's what generic quota code actually does for you (unless
> DQUOT_QUOTA_SYS_FILE flag is specified but that's not the case of ext?)
> - see vfs_load_quota_inode. We do:
> sync_filesystem(sb);
> invalidate_bdev(sb->s_bdev);
> ..
> inode->i_flags |= S_NOQUOTA | S_NOATIME | S_IMMUTABLE;
> ..
> So unless someone tries to screw us really hard, we should be fine.
That's good to hear. I think though we also need to call
sync_filesystem(sb) in dquot_disable(). Currently it calls
sb->s_op->sync_fs(), which forces out the superblock, and
sync_blockdev() which forces out any dirty buffer heads, but it
doesn't actually force a journal commit so that any pending journaled
writes to the quota file are forced out. We need to either explicitly
sync the quota files, or use sync_filesystem(sb) and sync everything.
The former might be more polite; in fact it might be sufficient in
vfs_load_quota_inode() as well? Or am I missing something?
- Ted
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-03 17:10 ` tytso
@ 2010-06-04 14:45 ` Jan Kara
0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2010-06-04 14:45 UTC (permalink / raw)
To: tytso; +Cc: Jan Kara, linux-ext4
On Thu 03-06-10 13:10:58, tytso@mit.edu wrote:
> On Thu, Jun 03, 2010 at 04:19:48PM +0200, Jan Kara wrote:
> > > All of these problems go away if the quota file isn't visible from
> > > userspace, and it becomes a special file. In the short term I think
> > > we could make this change, but I think we would also have to (1) treat
> > > the quota file as immutable while quotas are enabled (so it cannot be
> > > opened for writing), (2) force an fsync of the quota file and a
> > > journal commit before enabling quotas, and (3) force a journal commit
> > > after disabling quotas.
> > Ted, that's what generic quota code actually does for you (unless
> > DQUOT_QUOTA_SYS_FILE flag is specified but that's not the case of ext?)
> > - see vfs_load_quota_inode. We do:
> > sync_filesystem(sb);
> > invalidate_bdev(sb->s_bdev);
> > ..
> > inode->i_flags |= S_NOQUOTA | S_NOATIME | S_IMMUTABLE;
> > ..
> > So unless someone tries to screw us really hard, we should be fine.
>
> That's good to hear. I think though we also need to call
> sync_filesystem(sb) in dquot_disable(). Currently it calls
> sb->s_op->sync_fs(), which forces out the superblock, and
> sync_blockdev() which forces out any dirty buffer heads, but it
> doesn't actually force a journal commit so that any pending journaled
> writes to the quota file are forced out.
sb->s_op->sync_fs() is ext4_sync_fs() which does:
flush_workqueue(sbi->dio_unwritten_wq);
if (jbd2_journal_start_commit(sbi->s_journal, &target)) {
if (wait)
jbd2_log_wait_commit(sbi->s_journal, target);
}
So it does force out a journal commit and thus quota data. Or am I
missing something?
> We need to either explicitly
> sync the quota files, or use sync_filesystem(sb) and sync everything.
> The former might be more polite; in fact it might be sufficient in
> vfs_load_quota_inode() as well? Or am I missing something?
Syncing quota files in vfs_load_quota_inode() is not enough because
for filesystems with blocksize < pagesize we could still have dirty
buffers in the same blockdev page as used by a quota file. Thus subsequent
invalidate_bdev() does not remove the blockdev's page and kernel will still
see old data (i.e., not new data written by e.g. setquota via page cache).
This cache aliasing with quotas is nasty...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-03 12:53 ` tytso
2010-06-03 14:13 ` Dmitry Monakhov
2010-06-03 14:19 ` Jan Kara
@ 2010-06-03 16:47 ` Bernd Schubert
2 siblings, 0 replies; 13+ messages in thread
From: Bernd Schubert @ 2010-06-03 16:47 UTC (permalink / raw)
To: linux-ext4
>
> OTOH, I think we have similar races with journaled quotas, and no one
> has complained (although the vast majority of the quota documentation
> on various HOWTO pages still don't talk about journaled quotas, so I
> don't know how many people are using journaled quotas. :-/ )
Lustre uses journal quotas since 1.6.5, which was released in 2008. While
there had been a number of quotas issues, I'm only aware of one issue related
to journaled quotas and that was also Lustre specific. Maybe Andreas knows
more.
Cheers,
Bernd
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-06-02 14:23 [PATCH] ext4: Always journal quota file modifications Jan Kara
` (2 preceding siblings ...)
2010-06-03 12:53 ` tytso
@ 2010-07-05 20:08 ` Eric Sandeen
2010-07-27 13:37 ` Ted Ts'o
3 siblings, 1 reply; 13+ messages in thread
From: Eric Sandeen @ 2010-07-05 20:08 UTC (permalink / raw)
To: Jan Kara; +Cc: tytso, linux-ext4
Jan Kara wrote:
> When journaled quota options are not specified, we do writes
> to quota files just in data=ordered mode. This actually causes
> warnings from JBD2 about dirty journaled buffer because ext4_getblk
> unconditionally treats a block allocated by it as metadata. Since
> quota actually is filesystem metadata, the easiest way to get rid
> of the warning is to always treat quota writes as metadata...
>
> Signed-off-by: Jan Kara <jack@suse.cz>
Where are we at on this one? I was trawling through my old RH bugs
today, and have one that this patch will resolve. Ted, did you
get comfortable with it?
Thanks,
-Eric
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] ext4: Always journal quota file modifications
2010-07-05 20:08 ` Eric Sandeen
@ 2010-07-27 13:37 ` Ted Ts'o
0 siblings, 0 replies; 13+ messages in thread
From: Ted Ts'o @ 2010-07-27 13:37 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Jan Kara, linux-ext4
On Mon, Jul 05, 2010 at 03:08:12PM -0500, Eric Sandeen wrote:
> Jan Kara wrote:
> > When journaled quota options are not specified, we do writes
> > to quota files just in data=ordered mode. This actually causes
> > warnings from JBD2 about dirty journaled buffer because ext4_getblk
> > unconditionally treats a block allocated by it as metadata. Since
> > quota actually is filesystem metadata, the easiest way to get rid
> > of the warning is to always treat quota writes as metadata...
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
>
> Where are we at on this one? I was trawling through my old RH bugs
> today, and have one that this patch will resolve. Ted, did you
> get comfortable with it?
I've added this to the ext4 patch queue. (Going my backlog too,
before the merge window opens. :-)
- Ted
^ permalink raw reply [flat|nested] 13+ messages in thread