* kernel BUG at fs/jbd2/transaction.c:1086!
@ 2011-03-30 15:32 Lukas Czerner
2011-03-31 16:43 ` Jan Kara
0 siblings, 1 reply; 8+ messages in thread
From: Lukas Czerner @ 2011-03-30 15:32 UTC (permalink / raw)
To: linux-ext4; +Cc: jack, tytso, Eric Sandeen
Hello,
I have hit BUG_ON while running xfstest 234 in the loop
on the ext4. Backtrace below .
I thought that this problem was solved with
67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
I might be a bit hard to hit, but once you run
while true; do sync; sleep 0.3; done
concurrently it is reproducible almost all the time. I have tried to
understand what is going on but only thing I can see is this course
of action:
ext4_write_dquot
ext4_journal_start <- h_buffer_credits = 2
dquot_commit
v2_write_dquot
qtree_write_dquot
ext4_quota_write
ext4_handle_dirty_metadata <- this takes one block reserved
for journal
h_buffer_credits--
v2_write_file_info
ext4_quota_write
ext4_handle_dirty_metadata <- this takes second block reserved
for journal
h_buffer_credits--
ext4_journal_stop
However apparently I am missing something, because according to the
backtrace the second call of jbd2_journal_dirty_metadata() may end up
with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
So someone else is touching our handle apparently, but I go not exacly
know what is going on. From my simple debugging printk information I
have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
we can see:
[ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
credits 2
[ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
It is clear someone changed handle between v2_write_dquot() and
v2_write_file_info() does anyone have idea what is going on ?
Trace:
[ 226.465409] ------------[ cut here ]------------
[ 226.470006] kernel BUG at fs/jbd2/transaction.c:1086!
[ 226.475038] invalid opcode: 0000 [#1] SMP
[ 226.479134] last sysfs file:
/sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:00.0/0000:05:0
[ 226.494117] CPU 3
[ 226.495944] Modules linked in: sunrpc ipv6 joydev ghes bnx2 pcspkr
serio_raw hed dcdbas i7core_edac e
[ 226.519637]
[ 226.521120] Pid: 2308, comm: setquota Not tainted 2.6.38+ #8 Dell
Inc. PowerEdge R810/05W7DG
[ 226.529553] RIP: 0010:[<ffffffff811c68c2>] [<ffffffff811c68c2>]
jbd2_journal_dirty_metadata+0x50/0xd
[ 226.538843] RSP: 0018:ffff880659563b48 EFLAGS: 00010246
[ 226.544133] RAX: 0000000000000000 RBX: ffff8808771eea80 RCX:
0000000000000000
[ 226.551244] RDX: ffff8808771f5000 RSI: ffff8804772a0b60 RDI:
ffff8804772a0b60
[ 226.558351] RBP: ffff880659563b78 R08: ffff8804772a0b60 R09:
ffff8808771eea80
[ 226.565456] R10: ffff8804766e3d0c R11: ffff880800000000 R12:
ffff88066ff4c000
[ 226.572562] R13: ffff8808771f5000 R14: ffff8804772a0b60 R15:
ffff88086f949400
[ 226.579670] FS: 00007f4d2dc07720(0000) GS:ffff88087f800000(0000)
knlGS:0000000000000000
[ 226.587728] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 226.593449] CR2: 00007f5d18512800 CR3: 00000004763a4000 CR4:
00000000000006e0
[ 226.600554] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 226.607659] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 226.614766] Process setquota (pid: 2308, threadinfo ffff880659562000,
task ffff88067766ae00)
[ 226.623169] Stack:
[ 226.625169] ffff880659563b68 0000000000000000 ffff8808771f5000
ffff8804772a0b60
[ 226.632571] 0000000000001267 ffffffff8161f7f0 ffff880659563bc8
ffffffff811b35db
[ 226.639973] ffff880659563ba8 ffffffff810691ae ffff8804772a0b60
ffff8808771a0450
[ 226.647377] Call Trace:
[ 226.649818] [<ffffffff811b35db>]
__ext4_handle_dirty_metadata+0xf1/0xfa
[ 226.656498] [<ffffffff810691ae>] ? wake_up_bit+0x25/0x2a
[ 226.661884] [<ffffffff811a7a9b>] ext4_quota_write+0x18d/0x20b
[ 226.667700] [<ffffffff811638d6>] ? qtree_write_dquot+0x114/0x126
[ 226.673771] [<ffffffff811623f9>] v2_write_file_info+0xa7/0xd6
[ 226.679581] [<ffffffff8115f246>] dquot_commit+0xb7/0xcf
[ 226.684871] [<ffffffff811a90d4>] ext4_write_dquot+0x61/0x83
[ 226.690510] [<ffffffff8115fe59>] dqput+0xb5/0x161
[ 226.695282] [<ffffffff81160d2f>] dquot_set_dqblk+0x302/0x312
[ 226.701022] [<ffffffff81163c21>] quota_setquota+0x131/0x153
[ 226.706660] [<ffffffff811eab8f>] ? avc_has_perm+0x5c/0x6e
[ 226.712125] [<ffffffff81164040>] do_quotactl+0x36b/0x47c
[ 226.717509] [<ffffffff8103cc69>] ? need_resched+0x23/0x2d
[ 226.722989] [<ffffffff8103cc81>] ? should_resched+0xe/0x2e
[ 226.728545] [<ffffffff81474bb3>] ? _cond_resched+0xe/0x22
[ 226.734013] [<ffffffff8112f3a0>] ? iput+0x36/0x138
[ 226.738870] [<ffffffff81164279>] sys_quotactl+0x128/0x15b
[ 226.744339] [<ffffffff8147c842>] system_call_fastpath+0x16/0x1b
[ 226.750320] Code: eb ff ff 85 c0 4d 8b 27 0f 85 94 00 00 00 4c 89 f7
e8 97 eb ff ff 83 7b 10 00 75 18
[ 226.769914] RIP [<ffffffff811c68c2>]
jbd2_journal_dirty_metadata+0x50/0xd6
[ 226.776901] RSP <ffff880659563b48>
[ 226.780402] ---[ end trace 6d2d535a183db323 ]---
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-30 15:32 kernel BUG at fs/jbd2/transaction.c:1086! Lukas Czerner
@ 2011-03-31 16:43 ` Jan Kara
2011-03-31 17:14 ` Lukas Czerner
2011-03-31 20:59 ` Lukas Czerner
0 siblings, 2 replies; 8+ messages in thread
From: Jan Kara @ 2011-03-31 16:43 UTC (permalink / raw)
To: Lukas Czerner; +Cc: linux-ext4, jack, tytso, Eric Sandeen
[-- Attachment #1: Type: text/plain, Size: 2377 bytes --]
On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
> Hello,
>
> I have hit BUG_ON while running xfstest 234 in the loop
> on the ext4. Backtrace below .
>
> I thought that this problem was solved with
> 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
> I might be a bit hard to hit, but once you run
>
> while true; do sync; sleep 0.3; done
>
> concurrently it is reproducible almost all the time. I have tried to
> understand what is going on but only thing I can see is this course
> of action:
>
> ext4_write_dquot
> ext4_journal_start <- h_buffer_credits = 2
> dquot_commit
> v2_write_dquot
> qtree_write_dquot
> ext4_quota_write
> ext4_handle_dirty_metadata <- this takes one block reserved
> for journal
> h_buffer_credits--
> v2_write_file_info
> ext4_quota_write
> ext4_handle_dirty_metadata <- this takes second block reserved
> for journal
> h_buffer_credits--
> ext4_journal_stop
>
> However apparently I am missing something, because according to the
> backtrace the second call of jbd2_journal_dirty_metadata() may end up
> with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
>
> So someone else is touching our handle apparently, but I go not exacly
> know what is going on. From my simple debugging printk information I
> have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
> we can see:
>
> [ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
> credits 2
> [ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
>
> It is clear someone changed handle between v2_write_dquot() and
> v2_write_file_info() does anyone have idea what is going on ?
It's simpler than that I believe. ext4_write_dquot() does also
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
ext4_mark_inode_dirty(handle, inode);
and that eats another credit. Looking at the comment before
EXT4_QUOTA_TRANS_BLOCKS, we in fact counted with inode and data update but
didn't count with the info update. It's actually a race that we ended up
doing info update in our transaction - someone marked info dirty and before
he managed to write it, we went in, saw the dirty flag and wrote it for
him. So the attached patch should fix the problem for you.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
[-- Attachment #2: 0001-quota-Don-t-write-quota-info-in-dquot_commit.patch --]
[-- Type: text/x-patch, Size: 1980 bytes --]
>From 029defbf639cb228a513a6c056cd601841ebf4ab Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 31 Mar 2011 18:36:52 +0200
Subject: [PATCH] quota: Don't write quota info in dquot_commit()
There's no reason to write quota info in dquot_commit(). The writing is a
relict from the old days when we didn't have dquot_acquire() and
dquot_release() and thus dquot_commit() could have created / removed quota
structures from the file. These days dquot_commit() only updates usage counters
/ limits in quota structure and thus there's no need to write quota info.
This also fixes an issue with journaling filesystem which didn't reserve
enough space in the transaction for write of quota info (it could have been
dirty at the time of dquot_commit() because of a race with other operation
changing it).
Reported-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/quota/dquot.c | 13 ++++---------
1 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index a2a622e..b59ee61 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -442,7 +442,7 @@ EXPORT_SYMBOL(dquot_acquire);
*/
int dquot_commit(struct dquot *dquot)
{
- int ret = 0, ret2 = 0;
+ int ret = 0;
struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
mutex_lock(&dqopt->dqio_mutex);
@@ -454,15 +454,10 @@ int dquot_commit(struct dquot *dquot)
spin_unlock(&dq_list_lock);
/* Inactive dquot can be only if there was error during read/init
* => we have better not writing it */
- if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
+ if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
ret = dqopt->ops[dquot->dq_type]->commit_dqblk(dquot);
- if (info_dirty(&dqopt->info[dquot->dq_type])) {
- ret2 = dqopt->ops[dquot->dq_type]->write_file_info(
- dquot->dq_sb, dquot->dq_type);
- }
- if (ret >= 0)
- ret = ret2;
- }
+ else
+ ret = -EIO;
out_sem:
mutex_unlock(&dqopt->dqio_mutex);
return ret;
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-31 16:43 ` Jan Kara
@ 2011-03-31 17:14 ` Lukas Czerner
2011-03-31 20:59 ` Lukas Czerner
1 sibling, 0 replies; 8+ messages in thread
From: Lukas Czerner @ 2011-03-31 17:14 UTC (permalink / raw)
To: Jan Kara; +Cc: Lukas Czerner, linux-ext4, tytso, Eric Sandeen
On Thu, 31 Mar 2011, Jan Kara wrote:
> On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
> > Hello,
> >
> > I have hit BUG_ON while running xfstest 234 in the loop
> > on the ext4. Backtrace below .
> >
> > I thought that this problem was solved with
> > 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
> > I might be a bit hard to hit, but once you run
> >
> > while true; do sync; sleep 0.3; done
> >
> > concurrently it is reproducible almost all the time. I have tried to
> > understand what is going on but only thing I can see is this course
> > of action:
> >
> > ext4_write_dquot
> > ext4_journal_start <- h_buffer_credits = 2
> > dquot_commit
> > v2_write_dquot
> > qtree_write_dquot
> > ext4_quota_write
> > ext4_handle_dirty_metadata <- this takes one block reserved
> > for journal
> > h_buffer_credits--
> > v2_write_file_info
> > ext4_quota_write
> > ext4_handle_dirty_metadata <- this takes second block reserved
> > for journal
> > h_buffer_credits--
> > ext4_journal_stop
> >
> > However apparently I am missing something, because according to the
> > backtrace the second call of jbd2_journal_dirty_metadata() may end up
> > with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
> >
> > So someone else is touching our handle apparently, but I go not exacly
> > know what is going on. From my simple debugging printk information I
> > have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
> > we can see:
> >
> > [ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
> > credits 2
> > [ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
> >
> > It is clear someone changed handle between v2_write_dquot() and
> > v2_write_file_info() does anyone have idea what is going on ?
> It's simpler than that I believe. ext4_write_dquot() does also
> inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> ext4_mark_inode_dirty(handle, inode);
> and that eats another credit. Looking at the comment before
> EXT4_QUOTA_TRANS_BLOCKS, we in fact counted with inode and data update but
> didn't count with the info update. It's actually a race that we ended up
> doing info update in our transaction - someone marked info dirty and before
> he managed to write it, we went in, saw the dirty flag and wrote it for
> him. So the attached patch should fix the problem for you.
>
> Honza
>
Hrm, I have been cracking my head with this bug and miss the obvious
ext4_mark_inode_dirty(). Thanks Honzo, I'll test the patch and let you
know.
Thanks!
-Lukas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-31 16:43 ` Jan Kara
2011-03-31 17:14 ` Lukas Czerner
@ 2011-03-31 20:59 ` Lukas Czerner
2011-03-31 22:23 ` Jan Kara
1 sibling, 1 reply; 8+ messages in thread
From: Lukas Czerner @ 2011-03-31 20:59 UTC (permalink / raw)
To: Jan Kara; +Cc: Lukas Czerner, linux-ext4, tytso, Eric Sandeen
On Thu, 31 Mar 2011, Jan Kara wrote:
> On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
> > Hello,
> >
> > I have hit BUG_ON while running xfstest 234 in the loop
> > on the ext4. Backtrace below .
> >
> > I thought that this problem was solved with
> > 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
> > I might be a bit hard to hit, but once you run
> >
> > while true; do sync; sleep 0.3; done
> >
> > concurrently it is reproducible almost all the time. I have tried to
> > understand what is going on but only thing I can see is this course
> > of action:
> >
> > ext4_write_dquot
> > ext4_journal_start <- h_buffer_credits = 2
> > dquot_commit
> > v2_write_dquot
> > qtree_write_dquot
> > ext4_quota_write
> > ext4_handle_dirty_metadata <- this takes one block reserved
> > for journal
> > h_buffer_credits--
> > v2_write_file_info
> > ext4_quota_write
> > ext4_handle_dirty_metadata <- this takes second block reserved
> > for journal
> > h_buffer_credits--
> > ext4_journal_stop
> >
> > However apparently I am missing something, because according to the
> > backtrace the second call of jbd2_journal_dirty_metadata() may end up
> > with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
> >
> > So someone else is touching our handle apparently, but I go not exacly
> > know what is going on. From my simple debugging printk information I
> > have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
> > we can see:
> >
> > [ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
> > credits 2
> > [ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
> >
> > It is clear someone changed handle between v2_write_dquot() and
> > v2_write_file_info() does anyone have idea what is going on ?
> It's simpler than that I believe. ext4_write_dquot() does also
> inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> ext4_mark_inode_dirty(handle, inode);
> and that eats another credit. Looking at the comment before
> EXT4_QUOTA_TRANS_BLOCKS, we in fact counted with inode and data update but
> didn't count with the info update. It's actually a race that we ended up
> doing info update in our transaction - someone marked info dirty and before
> he managed to write it, we went in, saw the dirty flag and wrote it for
> him. So the attached patch should fix the problem for you.
>
> Honza
>
The patch fixes the problem, you can add
Reported-and-tested-by: Lukas Czerner <lczerner@redhat.com>
Ted, could you pick this up ASAP ? Also I think this needs to go into
stable as well.
Thanks!
-Lukas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-31 20:59 ` Lukas Czerner
@ 2011-03-31 22:23 ` Jan Kara
2011-04-01 9:54 ` Amir Goldstein
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Jan Kara @ 2011-03-31 22:23 UTC (permalink / raw)
To: Lukas Czerner; +Cc: Jan Kara, linux-ext4, tytso, Eric Sandeen
[-- Attachment #1: Type: text/plain, Size: 5884 bytes --]
On Thu 31-03-11 22:59:24, Lukas Czerner wrote:
> On Thu, 31 Mar 2011, Jan Kara wrote:
>
> > On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
> > > Hello,
> > >
> > > I have hit BUG_ON while running xfstest 234 in the loop
> > > on the ext4. Backtrace below .
> > >
> > > I thought that this problem was solved with
> > > 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
> > > I might be a bit hard to hit, but once you run
> > >
> > > while true; do sync; sleep 0.3; done
> > >
> > > concurrently it is reproducible almost all the time. I have tried to
> > > understand what is going on but only thing I can see is this course
> > > of action:
> > >
> > > ext4_write_dquot
> > > ext4_journal_start <- h_buffer_credits = 2
> > > dquot_commit
> > > v2_write_dquot
> > > qtree_write_dquot
> > > ext4_quota_write
> > > ext4_handle_dirty_metadata <- this takes one block reserved
> > > for journal
> > > h_buffer_credits--
> > > v2_write_file_info
> > > ext4_quota_write
> > > ext4_handle_dirty_metadata <- this takes second block reserved
> > > for journal
> > > h_buffer_credits--
> > > ext4_journal_stop
> > >
> > > However apparently I am missing something, because according to the
> > > backtrace the second call of jbd2_journal_dirty_metadata() may end up
> > > with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
> > >
> > > So someone else is touching our handle apparently, but I go not exacly
> > > know what is going on. From my simple debugging printk information I
> > > have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
> > > we can see:
> > >
> > > [ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
> > > credits 2
> > > [ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
> > >
> > > It is clear someone changed handle between v2_write_dquot() and
> > > v2_write_file_info() does anyone have idea what is going on ?
> > It's simpler than that I believe. ext4_write_dquot() does also
> > inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> > ext4_mark_inode_dirty(handle, inode);
> > and that eats another credit. Looking at the comment before
> > EXT4_QUOTA_TRANS_BLOCKS, we in fact counted with inode and data update but
> > didn't count with the info update. It's actually a race that we ended up
> > doing info update in our transaction - someone marked info dirty and before
> > he managed to write it, we went in, saw the dirty flag and wrote it for
> > him. So the attached patch should fix the problem for you.
> >
>
> The patch fixes the problem, you can add
>
> Reported-and-tested-by: Lukas Czerner <lczerner@redhat.com>
>
> Ted, could you pick this up ASAP ? Also I think this needs to go into
> stable as well.
I'll merge it from my tree next week, OK? I don't think there's so much
hurry as the bug has been there for years and you're the first one to
report it :).
BTW, I've also created the patch below which can slightly improve
performance of quotas on ext4. The modification of the inode was
unnecessary as well... Ted, can you merge this patch? It's independent from
the previous one.
Honza
--
>From 51741760fe3682ea37bd264471f897bda9d5bd6d Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 31 Mar 2011 23:12:14 +0200
Subject: [PATCH] ext4: Remove unnecessary [cm]time update of quota file
It is not necessary to update [cm]time of quota file on each quota file
write and it wastes journal space and IO throughput with inode writes. So
just remove the updating from ext4_quota_write() and only update times when
quotas are being turned off. Userspace cannot get anything reliable from
quota files while they are used by the kernel anyway.
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/ext4/ext4_jbd2.h | 4 ++--
fs/ext4/super.c | 16 ++++++++++++++--
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index d8b992e..3163583 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -86,8 +86,8 @@
#ifdef CONFIG_QUOTA
/* Amount of blocks needed for quota update - we know that the structure was
- * allocated so we need to update only inode+data */
-#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 2 : 0)
+ * allocated so we need to update only data block */
+#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 1 : 0)
/* Amount of blocks needed for quota insert/delete - we do some block writes
* but inode, sb and group updates are done only once */
#define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 1f0acd4..baee1b4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4608,11 +4608,24 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
static int ext4_quota_off(struct super_block *sb, int type)
{
+ struct inode *inode = sb_dqopt(sb)->files[type];
+ handle_t *handle;
+
/* Force all delayed allocation blocks to be allocated.
* Caller already holds s_umount sem */
if (test_opt(sb, DELALLOC))
sync_filesystem(sb);
+ /* Update modification times of quota files when userspace can
+ * start looking at them */
+ handle = ext4_journal_start(inode, 1);
+ if (IS_ERR(handle))
+ goto out;
+ inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+ ext4_mark_inode_dirty(handle, inode);
+ ext4_journal_stop(handle);
+
+out:
return dquot_quota_off(sb, type);
}
@@ -4708,9 +4721,8 @@ out:
if (inode->i_size < off + len) {
i_size_write(inode, off + len);
EXT4_I(inode)->i_disksize = inode->i_size;
+ ext4_mark_inode_dirty(handle, inode);
}
- inode->i_mtime = inode->i_ctime = CURRENT_TIME;
- ext4_mark_inode_dirty(handle, inode);
mutex_unlock(&inode->i_mutex);
return len;
}
--
1.7.1
[-- Attachment #2: 0001-ext4-Remove-unnecessary-cm-time-update-of-quota-file.patch --]
[-- Type: text/x-patch, Size: 2641 bytes --]
>From 51741760fe3682ea37bd264471f897bda9d5bd6d Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 31 Mar 2011 23:12:14 +0200
Subject: [PATCH] ext4: Remove unnecessary [cm]time update of quota file
It is not necessary to update [cm]time of quota file on each quota file
write and it wastes journal space and IO throughput with inode writes. So
just remove the updating from ext4_quota_write() and only update times when
quotas are being turned off. Userspace cannot get anything reliable from
quota files while they are used by the kernel anyway.
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/ext4/ext4_jbd2.h | 4 ++--
fs/ext4/super.c | 16 ++++++++++++++--
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index d8b992e..3163583 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -86,8 +86,8 @@
#ifdef CONFIG_QUOTA
/* Amount of blocks needed for quota update - we know that the structure was
- * allocated so we need to update only inode+data */
-#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 2 : 0)
+ * allocated so we need to update only data block */
+#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 1 : 0)
/* Amount of blocks needed for quota insert/delete - we do some block writes
* but inode, sb and group updates are done only once */
#define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 1f0acd4..baee1b4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4608,11 +4608,24 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
static int ext4_quota_off(struct super_block *sb, int type)
{
+ struct inode *inode = sb_dqopt(sb)->files[type];
+ handle_t *handle;
+
/* Force all delayed allocation blocks to be allocated.
* Caller already holds s_umount sem */
if (test_opt(sb, DELALLOC))
sync_filesystem(sb);
+ /* Update modification times of quota files when userspace can
+ * start looking at them */
+ handle = ext4_journal_start(inode, 1);
+ if (IS_ERR(handle))
+ goto out;
+ inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+ ext4_mark_inode_dirty(handle, inode);
+ ext4_journal_stop(handle);
+
+out:
return dquot_quota_off(sb, type);
}
@@ -4708,9 +4721,8 @@ out:
if (inode->i_size < off + len) {
i_size_write(inode, off + len);
EXT4_I(inode)->i_disksize = inode->i_size;
+ ext4_mark_inode_dirty(handle, inode);
}
- inode->i_mtime = inode->i_ctime = CURRENT_TIME;
- ext4_mark_inode_dirty(handle, inode);
mutex_unlock(&inode->i_mutex);
return len;
}
--
1.7.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-31 22:23 ` Jan Kara
@ 2011-04-01 9:54 ` Amir Goldstein
2011-04-01 20:05 ` Lukas Czerner
2011-04-04 17:25 ` Ted Ts'o
2 siblings, 0 replies; 8+ messages in thread
From: Amir Goldstein @ 2011-04-01 9:54 UTC (permalink / raw)
To: Jan Kara, Yongqiang Yang; +Cc: Lukas Czerner, linux-ext4, tytso, Eric Sandeen
On Thu, Mar 31, 2011 at 3:23 PM, Jan Kara <jack@suse.cz> wrote:
> On Thu 31-03-11 22:59:24, Lukas Czerner wrote:
>> On Thu, 31 Mar 2011, Jan Kara wrote:
>>
>> > On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
>> > > Hello,
>> > >
>> > > I have hit BUG_ON while running xfstest 234 in the loop
>> > > on the ext4. Backtrace below .
>> > >
>> > > I thought that this problem was solved with
>> > > 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
>> > > I might be a bit hard to hit, but once you run
>> > >
>> > > while true; do sync; sleep 0.3; done
>> > >
>> > > concurrently it is reproducible almost all the time. I have tried to
>> > > understand what is going on but only thing I can see is this course
>> > > of action:
>> > >
>> > > ext4_write_dquot
>> > > ext4_journal_start <- h_buffer_credits = 2
>> > > dquot_commit
>> > > v2_write_dquot
>> > > qtree_write_dquot
>> > > ext4_quota_write
>> > > ext4_handle_dirty_metadata <- this takes one block reserved
>> > > for journal
>> > > h_buffer_credits--
>> > > v2_write_file_info
>> > > ext4_quota_write
>> > > ext4_handle_dirty_metadata <- this takes second block reserved
>> > > for journal
>> > > h_buffer_credits--
>> > > ext4_journal_stop
>> > >
>> > > However apparently I am missing something, because according to the
>> > > backtrace the second call of jbd2_journal_dirty_metadata() may end up
>> > > with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
>> > >
>> > > So someone else is touching our handle apparently, but I go not exacly
>> > > know what is going on. From my simple debugging printk information I
>> > > have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
>> > > we can see:
>> > >
>> > > [ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
>> > > credits 2
>> > > [ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
>> > >
>> > > It is clear someone changed handle between v2_write_dquot() and
>> > > v2_write_file_info() does anyone have idea what is going on ?
>> > It's simpler than that I believe. ext4_write_dquot() does also
>> > inode->i_mtime = inode->i_ctime = CURRENT_TIME;
>> > ext4_mark_inode_dirty(handle, inode);
>> > and that eats another credit. Looking at the comment before
>> > EXT4_QUOTA_TRANS_BLOCKS, we in fact counted with inode and data update but
>> > didn't count with the info update. It's actually a race that we ended up
>> > doing info update in our transaction - someone marked info dirty and before
>> > he managed to write it, we went in, saw the dirty flag and wrote it for
>> > him. So the attached patch should fix the problem for you.
>> >
>>
>> The patch fixes the problem, you can add
>>
>> Reported-and-tested-by: Lukas Czerner <lczerner@redhat.com>
>>
>> Ted, could you pick this up ASAP ? Also I think this needs to go into
>> stable as well.
> I'll merge it from my tree next week, OK? I don't think there's so much
> hurry as the bug has been there for years and you're the first one to
> report it :).
>
> BTW, I've also created the patch below which can slightly improve
> performance of quotas on ext4. The modification of the inode was
> unnecessary as well... Ted, can you merge this patch? It's independent from
> the previous one.
>
> Honza
> --
> From 51741760fe3682ea37bd264471f897bda9d5bd6d Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Thu, 31 Mar 2011 23:12:14 +0200
> Subject: [PATCH] ext4: Remove unnecessary [cm]time update of quota file
>
> It is not necessary to update [cm]time of quota file on each quota file
> write and it wastes journal space and IO throughput with inode writes. So
> just remove the updating from ext4_quota_write() and only update times when
> quotas are being turned off. Userspace cannot get anything reliable from
> quota files while they are used by the kernel anyway.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ext4/ext4_jbd2.h | 4 ++--
> fs/ext4/super.c | 16 ++++++++++++++--
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index d8b992e..3163583 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -86,8 +86,8 @@
>
> #ifdef CONFIG_QUOTA
> /* Amount of blocks needed for quota update - we know that the structure was
> - * allocated so we need to update only inode+data */
> -#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 2 : 0)
> + * allocated so we need to update only data block */
> +#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 1 : 0)
Cool :-)
Yongqiang,
Now we only need to account for 2 (user+group) snapshot quota credits for COW.
> /* Amount of blocks needed for quota insert/delete - we do some block writes
> * but inode, sb and group updates are done only once */
> #define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 1f0acd4..baee1b4 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4608,11 +4608,24 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
>
> static int ext4_quota_off(struct super_block *sb, int type)
> {
> + struct inode *inode = sb_dqopt(sb)->files[type];
> + handle_t *handle;
> +
> /* Force all delayed allocation blocks to be allocated.
> * Caller already holds s_umount sem */
> if (test_opt(sb, DELALLOC))
> sync_filesystem(sb);
>
> + /* Update modification times of quota files when userspace can
> + * start looking at them */
> + handle = ext4_journal_start(inode, 1);
> + if (IS_ERR(handle))
> + goto out;
> + inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> + ext4_mark_inode_dirty(handle, inode);
> + ext4_journal_stop(handle);
> +
> +out:
> return dquot_quota_off(sb, type);
> }
>
> @@ -4708,9 +4721,8 @@ out:
> if (inode->i_size < off + len) {
> i_size_write(inode, off + len);
> EXT4_I(inode)->i_disksize = inode->i_size;
> + ext4_mark_inode_dirty(handle, inode);
> }
> - inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> - ext4_mark_inode_dirty(handle, inode);
> mutex_unlock(&inode->i_mutex);
> return len;
> }
> --
> 1.7.1
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-31 22:23 ` Jan Kara
2011-04-01 9:54 ` Amir Goldstein
@ 2011-04-01 20:05 ` Lukas Czerner
2011-04-04 17:25 ` Ted Ts'o
2 siblings, 0 replies; 8+ messages in thread
From: Lukas Czerner @ 2011-04-01 20:05 UTC (permalink / raw)
To: Jan Kara; +Cc: Lukas Czerner, linux-ext4, tytso, Eric Sandeen
On Fri, 1 Apr 2011, Jan Kara wrote:
> On Thu 31-03-11 22:59:24, Lukas Czerner wrote:
> > On Thu, 31 Mar 2011, Jan Kara wrote:
> >
> > > On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
> > > > Hello,
> > > >
> > > > I have hit BUG_ON while running xfstest 234 in the loop
> > > > on the ext4. Backtrace below .
> > > >
> > > > I thought that this problem was solved with
> > > > 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
> > > > I might be a bit hard to hit, but once you run
> > > >
> > > > while true; do sync; sleep 0.3; done
> > > >
> > > > concurrently it is reproducible almost all the time. I have tried to
> > > > understand what is going on but only thing I can see is this course
> > > > of action:
> > > >
> > > > ext4_write_dquot
> > > > ext4_journal_start <- h_buffer_credits = 2
> > > > dquot_commit
> > > > v2_write_dquot
> > > > qtree_write_dquot
> > > > ext4_quota_write
> > > > ext4_handle_dirty_metadata <- this takes one block reserved
> > > > for journal
> > > > h_buffer_credits--
> > > > v2_write_file_info
> > > > ext4_quota_write
> > > > ext4_handle_dirty_metadata <- this takes second block reserved
> > > > for journal
> > > > h_buffer_credits--
> > > > ext4_journal_stop
> > > >
> > > > However apparently I am missing something, because according to the
> > > > backtrace the second call of jbd2_journal_dirty_metadata() may end up
> > > > with BUG_ON -> J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
> > > >
> > > > So someone else is touching our handle apparently, but I go not exacly
> > > > know what is going on. From my simple debugging printk information I
> > > > have put with before ext4_handle_dirty_metadata() in ext4_quota_write()
> > > > we can see:
> > > >
> > > > [ 226.450263] ext4_quota_write: len 48, off 5136, type 0 h_ref 1
> > > > credits 2
> > > > [ 226.458950] ext4_quota_write: len 24, off 8, type 0 h_ref 1 credits 0
> > > >
> > > > It is clear someone changed handle between v2_write_dquot() and
> > > > v2_write_file_info() does anyone have idea what is going on ?
> > > It's simpler than that I believe. ext4_write_dquot() does also
> > > inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> > > ext4_mark_inode_dirty(handle, inode);
> > > and that eats another credit. Looking at the comment before
> > > EXT4_QUOTA_TRANS_BLOCKS, we in fact counted with inode and data update but
> > > didn't count with the info update. It's actually a race that we ended up
> > > doing info update in our transaction - someone marked info dirty and before
> > > he managed to write it, we went in, saw the dirty flag and wrote it for
> > > him. So the attached patch should fix the problem for you.
> > >
> >
> > The patch fixes the problem, you can add
> >
> > Reported-and-tested-by: Lukas Czerner <lczerner@redhat.com>
> >
> > Ted, could you pick this up ASAP ? Also I think this needs to go into
> > stable as well.
> I'll merge it from my tree next week, OK? I don't think there's so much
> hurry as the bug has been there for years and you're the first one to
> report it :).
Of course :) sometimes I am a bit hasty.
>
> BTW, I've also created the patch below which can slightly improve
> performance of quotas on ext4. The modification of the inode was
> unnecessary as well... Ted, can you merge this patch? It's independent from
> the previous one.
>
> Honza
> --
> From 51741760fe3682ea37bd264471f897bda9d5bd6d Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Thu, 31 Mar 2011 23:12:14 +0200
> Subject: [PATCH] ext4: Remove unnecessary [cm]time update of quota file
>
> It is not necessary to update [cm]time of quota file on each quota file
> write and it wastes journal space and IO throughput with inode writes. So
> just remove the updating from ext4_quota_write() and only update times when
> quotas are being turned off. Userspace cannot get anything reliable from
> quota files while they are used by the kernel anyway.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> fs/ext4/ext4_jbd2.h | 4 ++--
> fs/ext4/super.c | 16 ++++++++++++++--
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
> index d8b992e..3163583 100644
> --- a/fs/ext4/ext4_jbd2.h
> +++ b/fs/ext4/ext4_jbd2.h
> @@ -86,8 +86,8 @@
>
> #ifdef CONFIG_QUOTA
> /* Amount of blocks needed for quota update - we know that the structure was
> - * allocated so we need to update only inode+data */
> -#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 2 : 0)
> + * allocated so we need to update only data block */
> +#define EXT4_QUOTA_TRANS_BLOCKS(sb) (test_opt(sb, QUOTA) ? 1 : 0)
> /* Amount of blocks needed for quota insert/delete - we do some block writes
> * but inode, sb and group updates are done only once */
> #define EXT4_QUOTA_INIT_BLOCKS(sb) (test_opt(sb, QUOTA) ? (DQUOT_INIT_ALLOC*\
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 1f0acd4..baee1b4 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4608,11 +4608,24 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
>
> static int ext4_quota_off(struct super_block *sb, int type)
> {
> + struct inode *inode = sb_dqopt(sb)->files[type];
> + handle_t *handle;
> +
> /* Force all delayed allocation blocks to be allocated.
> * Caller already holds s_umount sem */
> if (test_opt(sb, DELALLOC))
> sync_filesystem(sb);
>
> + /* Update modification times of quota files when userspace can
> + * start looking at them */
> + handle = ext4_journal_start(inode, 1);
> + if (IS_ERR(handle))
> + goto out;
> + inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> + ext4_mark_inode_dirty(handle, inode);
> + ext4_journal_stop(handle);
> +
> +out:
> return dquot_quota_off(sb, type);
> }
>
> @@ -4708,9 +4721,8 @@ out:
> if (inode->i_size < off + len) {
> i_size_write(inode, off + len);
> EXT4_I(inode)->i_disksize = inode->i_size;
> + ext4_mark_inode_dirty(handle, inode);
> }
> - inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> - ext4_mark_inode_dirty(handle, inode);
> mutex_unlock(&inode->i_mutex);
> return len;
> }
>
--
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: kernel BUG at fs/jbd2/transaction.c:1086!
2011-03-31 22:23 ` Jan Kara
2011-04-01 9:54 ` Amir Goldstein
2011-04-01 20:05 ` Lukas Czerner
@ 2011-04-04 17:25 ` Ted Ts'o
2 siblings, 0 replies; 8+ messages in thread
From: Ted Ts'o @ 2011-04-04 17:25 UTC (permalink / raw)
To: Jan Kara; +Cc: Lukas Czerner, linux-ext4, Eric Sandeen
On Fri, Apr 01, 2011 at 12:23:06AM +0200, Jan Kara wrote:
> > Ted, could you pick this up ASAP ? Also I think this needs to go into
> > stable as well.
> I'll merge it from my tree next week, OK? I don't think there's so much
> hurry as the bug has been there for years and you're the first one to
> report it :).
Agreed, your first patch was quota specific and I'd much rather have
it go through the quota tree.
> BTW, I've also created the patch below which can slightly improve
> performance of quotas on ext4. The modification of the inode was
> unnecessary as well... Ted, can you merge this patch? It's independent from
> the previous one.
Yup, I've added to the ext4 patch queue.
- Ted
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-04-04 17:25 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-30 15:32 kernel BUG at fs/jbd2/transaction.c:1086! Lukas Czerner
2011-03-31 16:43 ` Jan Kara
2011-03-31 17:14 ` Lukas Czerner
2011-03-31 20:59 ` Lukas Czerner
2011-03-31 22:23 ` Jan Kara
2011-04-01 9:54 ` Amir Goldstein
2011-04-01 20:05 ` Lukas Czerner
2011-04-04 17:25 ` Ted Ts'o
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).