* [BUG] journal handler reference count breaked and fs deadlocked
@ 2003-11-09 21:34 Alex Lyashkov
2003-11-10 9:12 ` Alex Tomas
2003-11-10 11:13 ` Jan Kara
0 siblings, 2 replies; 8+ messages in thread
From: Alex Lyashkov @ 2003-11-09 21:34 UTC (permalink / raw)
To: Jan Kara; +Cc: Andrew Morton, linux-kernel, Herbert Poetzl
Hello all,
I am trying to locate the point where the filesystem deadlocks.
After recompiling the kernel with JBD debugging enabled and setting the debug level to 100, I logged the kernel output
via a serial console.
After the deadlock, I see this in the log:
==============
(journal.c, 581): log_start_commit: JBD: requesting commit 501252/501251
(journal.c, 608): log_wait_commit: JBD: want 501252, j_commit_sequence=501251
(journal.c, 263): kjournald: kjournald wakes
(journal.c, 238): kjournald: commit_sequence=501251, commit_request=501252
(journal.c, 242): kjournald: OK, requests differ
(commit.c, 81): journal_commit_transaction: JBD: starting commit of
transaction 501252
(commit.c, 87): journal_commit_transaction: wait updates.......
(transaction.c, 567): do_get_write_access: buffer_head c79f2e70, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79f2e70, cancelling
revoke
(transaction.c, 567): do_get_write_access: buffer_head c79f2e70, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79f2e70, cancelling
revoke
(transaction.c, 1104): journal_dirty_metadata: journal_head c79f2e70
(transaction.c, 1392): journal_stop: h_ref 2 -> 1
==============
I think this is the reason the fs deadlocked: the wait queue is never woken up :-\
If I am right, this is a very big problem in ext3.
I can provide other logs and information on request.
--
With best regards,
Alex
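
The last log line above (`journal_stop: h_ref 2 -> 1`) is what JBD prints when a nested handle is stopped. The following is a minimal user-space sketch of that reference counting, not the kernel's code: every `model_` name is hypothetical, and the single global `current_handle` stands in for the per-task `current->journal_info`. It only illustrates why a commit that is waiting for updates cannot proceed until the outermost stop has run.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for JBD's transaction and handle (illustrative only). */
struct model_transaction {
    int t_updates;                  /* handles still open on this transaction */
};

struct model_handle {
    struct model_transaction *h_transaction;
    int h_ref;                      /* nesting depth of start calls */
};

/* Plays the role of current->journal_info for a single task. */
static struct model_handle *current_handle = NULL;

/* Start a handle; a nested start only bumps h_ref on the existing handle. */
static struct model_handle *model_journal_start(struct model_transaction *t,
                                                struct model_handle *storage)
{
    if (current_handle != NULL) {
        current_handle->h_ref++;
        return current_handle;
    }
    storage->h_transaction = t;
    storage->h_ref = 1;
    t->t_updates++;
    current_handle = storage;
    return storage;
}

/* Stop a handle. Returns 1 if this call released the update (the point at
 * which a waiting committer would be woken), 0 if it was a nested stop. */
static int model_journal_stop(struct model_handle *h)
{
    if (--h->h_ref > 0)
        return 0;                   /* the "h_ref 2 -> 1" case from the log */
    h->h_transaction->t_updates--;
    current_handle = NULL;
    return 1;
}

/* The committer may proceed only once no updates are outstanding. */
static int model_commit_may_proceed(const struct model_transaction *t)
{
    return t->t_updates == 0;
}
```

In this model a nested stop is harmless by itself; the commit stays blocked only if the outermost `model_journal_stop()` never happens, for example because the task holding the handle is itself waiting on something else.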
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-09 21:34 [BUG] journal handler reference count breaked and fs deadlocked Alex Lyashkov
@ 2003-11-10 9:12 ` Alex Tomas
2003-11-10 9:55 ` Alex Lyashkov
2003-11-10 11:13 ` Jan Kara
1 sibling, 1 reply; 8+ messages in thread
From: Alex Tomas @ 2003-11-10 9:12 UTC (permalink / raw)
To: Alex Lyashkov; +Cc: Jan Kara, Andrew Morton, linux-kernel, Herbert Poetzl
What was the system doing at that time? What are the mount options?
On Sun, 9 Nov 2003 23:34:00 +0200
Alex Lyashkov <shadow@itt.net.ru> wrote:
> Hello All
>
> I try locate what are point where fs deadlocked.
> after recompile kernel with debug jbd and set debug level to 100 i log kernel
> via serial console
> after deadlock - i see in log
> ==============
> (journal.c, 581): log_start_commit: JBD: requesting commit 501252/501251
> (journal.c, 608): log_wait_commit: JBD: want 501252, j_commit_sequence=501251
> (journal.c, 263): kjournald: kjournald wakes
> (journal.c, 238): kjournald: commit_sequence=501251, commit_request=501252
> (journal.c, 242): kjournald: OK, requests differ
> (commit.c, 81): journal_commit_transaction: JBD: starting commit of
> transaction 501252
> (commit.c, 87): journal_commit_transaction: wait updates.......
> (transaction.c, 567): do_get_write_access: buffer_head c79f2e70, force_copy 0
> (revoke.c, 375): journal_cancel_revoke: journal_head c79f2e70, cancelling
> revoke
> (transaction.c, 567): do_get_write_access: buffer_head c79f2e70, force_copy 0
> (revoke.c, 375): journal_cancel_revoke: journal_head c79f2e70, cancelling
> revoke
> (transaction.c, 1104): journal_dirty_metadata: journal_head c79f2e70
> (transaction.c, 1392): journal_stop: h_ref 2 -> 1
> ==============
> i think it`s reason fs deadlocked, because wait query not be waked :-\
> if i right - it very big problem on ext3..
> other logs\infos can be created after request....
>
>
> --
> With best regards,
> Alex
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-10 9:12 ` Alex Tomas
@ 2003-11-10 9:55 ` Alex Lyashkov
0 siblings, 0 replies; 8+ messages in thread
From: Alex Lyashkov @ 2003-11-10 9:55 UTC (permalink / raw)
To: Alex Tomas; +Cc: linux-kernel
On Monday 10 November 2003 11:12, Alex Tomas wrote:
> what system did that time? mount options?
See my last message.
On one console: starting and stopping services.
On a second console: mount -o remount,usrquota,grpquota / (only to test
syncfs).
--
With best regards,
Alex
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-09 21:34 [BUG] journal handler reference count breaked and fs deadlocked Alex Lyashkov
2003-11-10 9:12 ` Alex Tomas
@ 2003-11-10 11:13 ` Jan Kara
2003-11-10 11:48 ` Alex Lyashkov
1 sibling, 1 reply; 8+ messages in thread
From: Jan Kara @ 2003-11-10 11:13 UTC (permalink / raw)
To: Alex Lyashkov; +Cc: Andrew Morton, linux-kernel, Herbert Poetzl
[-- Attachment #1: Type: text/plain, Size: 294 bytes --]
Hi,
thanks for tracking this down. Are you able to reproduce the problem on
recent vanilla kernels too (i.e. 2.4.22)? Can you try the vanilla kernel with
the attached patch? It should fix one of the possible deadlocks.
Thanks for testing
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
[-- Attachment #2: quota-2.4.22-deadlockfix.diff --]
[-- Type: text/plain, Size: 3219 bytes --]
diff -ruX ../kerndiffexclude linux-2.4.22-fixstat/fs/dquot.c linux-2.4.22-deadlock/fs/dquot.c
--- linux-2.4.22-fixstat/fs/dquot.c Wed Nov 5 21:21:58 2003
+++ linux-2.4.22-deadlock/fs/dquot.c Mon Nov 10 11:48:47 2003
@@ -396,7 +396,7 @@
if (dquot->dq_flags & DQ_LOCKED)
wait_on_dquot(dquot);
if (dquot_dirty(dquot))
- sb->dq_op->sync_dquot(dquot);
+ sb->dq_op->write_dquot(dquot);
dqput(dquot);
goto restart;
}
@@ -543,7 +543,7 @@
return;
}
if (dquot_dirty(dquot)) {
- commit_dqblk(dquot);
+ dquot->dq_sb->dq_op->write_dquot(dquot);
goto we_slept;
}
@@ -1219,7 +1219,7 @@
free_space: dquot_free_space,
free_inode: dquot_free_inode,
transfer: dquot_transfer,
- sync_dquot: commit_dqblk
+ write_dquot: commit_dqblk
};
/* Function used by filesystems for initializing the dquot_operations structure */
@@ -1331,9 +1331,9 @@
error = -EINVAL;
if (!fmt->qf_ops->check_quota_file(sb, type))
goto out_f;
- /* We don't want quota on quota files */
+ /* We don't want quota and atime on quota files */
dquot_drop(inode);
- inode->i_flags |= S_NOQUOTA;
+ inode->i_flags |= S_NOQUOTA | S_NOATIME;
dqopt->ops[type] = fmt->qf_ops;
dqopt->info[type].dqi_format = fmt;
diff -ruX ../kerndiffexclude linux-2.4.22-fixstat/fs/ext3/super.c linux-2.4.22-deadlock/fs/ext3/super.c
--- linux-2.4.22-fixstat/fs/ext3/super.c Mon Aug 25 13:44:43 2003
+++ linux-2.4.22-deadlock/fs/ext3/super.c Mon Nov 10 11:43:32 2003
@@ -449,7 +449,7 @@
}
static struct dquot_operations ext3_qops;
-static int (*old_sync_dquot)(struct dquot *dquot);
+static int (*old_write_dquot)(struct dquot *dquot);
static struct super_operations ext3_sops = {
read_inode: ext3_read_inode, /* BKL held */
@@ -1779,7 +1779,7 @@
/* Blocks: quota info + (4 pointer blocks + 1 entry block) * (3 indirect + 1 descriptor + 1 bitmap) + superblock */
#define EXT3_V0_QFMT_BLOCKS 27
-static int ext3_sync_dquot(struct dquot *dquot)
+static int ext3_write_dquot(struct dquot *dquot)
{
int nblocks, ret;
handle_t *handle;
@@ -1804,7 +1804,7 @@
return PTR_ERR(handle);
}
unlock_kernel();
- ret = old_sync_dquot(dquot);
+ ret = old_write_dquot(dquot);
lock_kernel();
ret = ext3_journal_stop(handle, qinode);
unlock_kernel();
@@ -1818,8 +1818,8 @@
{
#ifdef CONFIG_QUOTA
init_dquot_operations(&ext3_qops);
- old_sync_dquot = ext3_qops.sync_dquot;
- ext3_qops.sync_dquot = ext3_sync_dquot;
+ old_write_dquot = ext3_qops.write_dquot;
+ ext3_qops.write_dquot = ext3_write_dquot;
#endif
return register_filesystem(&ext3_fs_type);
}
diff -ruX ../kerndiffexclude linux-2.4.22-fixstat/include/linux/quota.h linux-2.4.22-deadlock/include/linux/quota.h
--- linux-2.4.22-fixstat/include/linux/quota.h Wed Nov 5 21:27:44 2003
+++ linux-2.4.22-deadlock/include/linux/quota.h Mon Nov 10 11:40:06 2003
@@ -249,7 +249,7 @@
void (*free_space) (struct inode *, qsize_t);
void (*free_inode) (const struct inode *, unsigned long);
int (*transfer) (struct inode *, struct iattr *);
- int (*sync_dquot) (struct dquot *);
+ int (*write_dquot) (struct dquot *);
};
/* Operations handling requests from userspace */
Binary files linux-2.4.22-fixstat/linux and linux-2.4.22-deadlock/linux differ
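
The patch above relies on a save-and-wrap pattern: ext3 stores the generic `write_dquot` pointer and installs its own function, which opens a journal handle before calling the saved one. The following is a minimal user-space sketch of that pattern under stated assumptions; all `model_` and `fs_` names are hypothetical, and a counter stands in for starting a journal handle. It shows why a call through the ops table reaches the filesystem wrapper while a direct call to the generic function bypasses it.

```c
#include <assert.h>

/* Simplified stand-in for struct dquot (illustrative only). */
struct model_dquot {
    int dirty;
    int writes;
};

/* Simplified stand-in for struct dquot_operations. */
struct model_dquot_ops {
    int (*write_dquot)(struct model_dquot *);
};

/* Counts how often the filesystem wrapper ran; stands in for opening a
 * journal handle around the quota write. */
static int journal_handles_started = 0;

/* Generic quota write, playing the role of commit_dqblk(). */
static int model_generic_write(struct model_dquot *dq)
{
    dq->dirty = 0;
    dq->writes++;
    return 0;
}

/* Saved generic operation, like old_write_dquot in the patch. */
static int (*model_old_write)(struct model_dquot *);

/* Filesystem wrapper, playing the role of ext3_write_dquot(). */
static int fs_write_dquot(struct model_dquot *dq)
{
    journal_handles_started++;
    return model_old_write(dq);
}

/* Fill in the generic operations, like init_dquot_operations(). */
static void model_init_ops(struct model_dquot_ops *ops)
{
    ops->write_dquot = model_generic_write;
}

/* Save the generic pointer and install the wrapper in its place. */
static void fs_install_wrapper(struct model_dquot_ops *ops)
{
    model_old_write = ops->write_dquot;
    ops->write_dquot = fs_write_dquot;
}
```

In this model, `ops.write_dquot(&dq)` increments `journal_handles_started`, while calling `model_generic_write(&dq)` directly does not, which mirrors the one call site the patch changes from `commit_dqblk(dquot)` to `dquot->dq_sb->dq_op->write_dquot(dquot)`.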
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-10 11:13 ` Jan Kara
@ 2003-11-10 11:48 ` Alex Lyashkov
2003-11-10 11:50 ` Jan Kara
0 siblings, 1 reply; 8+ messages in thread
From: Alex Lyashkov @ 2003-11-10 11:48 UTC (permalink / raw)
To: Jan Kara; +Cc: Andrew Morton, linux-kernel, Herbert Poetzl
On Monday 10 November 2003 13:13, Jan Kara wrote:
> Hi,
>
> thanks for tracking. Are you able to reproduce the problem also on
> recent vanilla kernels (ie. 2.4.22)? Can you try the vanilla kernel with
> the attached patch (it should fix one of possible deadlocks).
>
Hi Jan
I can't do that with a vanilla kernel, because my kernel is not exactly the RH kernel.
It is the kernel from my fork of the vserver project, adapted to the RH kernel tree.
I was doing stress testing when I saw these problems.
As far as I can see, you only renamed the function, set S_NOATIME on the quota files,
and changed one call site:
- commit_dqblk(dquot);
+ dquot->dq_sb->dq_op->write_dquot(dquot);
Is that right?
I will try to adapt this fix to my kernel.
--
With best regards,
Alex
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-10 11:48 ` Alex Lyashkov
@ 2003-11-10 11:50 ` Jan Kara
2003-11-10 13:07 ` Alex Lyashkov
2003-11-10 16:07 ` Alex Lyashkov
0 siblings, 2 replies; 8+ messages in thread
From: Jan Kara @ 2003-11-10 11:50 UTC (permalink / raw)
To: Alex Lyashkov; +Cc: Andrew Morton, linux-kernel, Herbert Poetzl
> On Monday 10 November 2003 13:13, Jan Kara wrote:
> > Hi,
> >
> > thanks for tracking. Are you able to reproduce the problem also on
> > recent vanilla kernels (ie. 2.4.22)? Can you try the vanilla kernel with
> > the attached patch (it should fix one of possible deadlocks).
> >
> Hi Jan
>
> I can`t do it with vanila kernel, because my kernel not be exactly rh kernel.
>
> It kernel from my fork vserver project who adapted to RH kernel tree.
> i do stress testing and see this problems.
> I see you only rename function, set NO_ATIME to diskquota..
> and change at one point
> - commit_dqblk(dquot);
> + dquot->dq_sb->dq_op->write_dquot(dquot);
> it`s rignt ?
> i probe to adapted it fix to my kernel..
Yes, those should be the only changes.
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-10 11:50 ` Jan Kara
@ 2003-11-10 13:07 ` Alex Lyashkov
2003-11-10 16:07 ` Alex Lyashkov
1 sibling, 0 replies; 8+ messages in thread
From: Alex Lyashkov @ 2003-11-10 13:07 UTC (permalink / raw)
To: Jan Kara; +Cc: Andrew Morton, linux-kernel, Herbert Poetzl
> >
> > It kernel from my fork vserver project who adapted to RH kernel tree.
> > i do stress testing and see this problems.
> > I see you only rename function, set NO_ATIME to diskquota..
> > and change at one point
> > - commit_dqblk(dquot);
> > + dquot->dq_sb->dq_op->write_dquot(dquot);
> > it`s rignt ?
> > i probe to adapted it fix to my kernel..
>
> Yes, that should be the only changes.
>
I added it, but it does not fix my problem.
I enabled debug printing in the disk quota code and see that the quota sync finishes before
the kernel locks up.
Sample output:
============================================================
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6990, cancelling
revoke
(transaction.c, 567): do_get_write_access: buffer_head c79d6990, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6990, cancelling
revoke
(transaction.c, 1104): journal_dirty_metadata: journal_head c79d6990
(transaction.c, 1391): journal_stop: Handle c6cd0500 going down
(transaction.c, 102): start_this_handle: New handle c6cd0500 going live.
(transaction.c, 204): start_this_handle: Handle c6cd0500 given 1 credits
(total)
(inode.c, 2587): ext3_dirty_inode: marking dirty. outer handle=00000000
(transaction.c, 567): do_get_write_access: buffer_head c79d6990, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6990, cancelling
revoke
(transaction.c, 567): do_get_write_access: buffer_head c79d6990, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6990, cancelling
revoke
(transaction.c, 1104): journal_dirty_metadata: journal_head c79d6990
(transaction.c, 1391): journal_stop: Handle c6cd0500 going down
(transaction.c, 102): start_this_handle: New handle c6cd0500 going live.
(transaction.c, 204): start_this_handle: Handle c6cd0500 given 1 credits
(total)
(inode.c, 2587): ext3_dirty_inode: marking dirty. outer handle=00000000
(transaction.c, 102): start_this_handle: New handle c6cd0340 going live.
(transaction.c, 204): start_this_handle: Handle c6cd0340 given 26 credits
(tota)
(transaction.c, 955): journal_dirty_data: jh: c79d6c00, tid:507179
(transaction.c, 1391): journal_stop: Handle c6cd0340 going down
dqput(): dq:c7b7e180
put_dqout_ref c7b7e180 341 USR 81
ctx_sync_dquots end
(journal.c, 581): log_start_commit: JBD: requesting commit 507179/507178
(journal.c, 608): log_wait_commit: JBD: want 507179, j_commit_sequence=507178
(journal.c, 263): kjournald: kjournald wakes
(journal.c, 238): kjournald: commit_sequence=507178, commit_request=507179
(journal.c, 242): kjournald: OK, requests differ
(commit.c, 81): journal_commit_transaction: JBD: starting commit of
transaction9
(commit.c, 87): journal_commit_transaction: wait updates.......
(transaction.c, 102): start_this_handle: New handle c6cd0340 going live.
(transaction.c, 136): start_this_handle: Handle c6cd0340 stalling...
(transaction.c, 567): do_get_write_access: buffer_head c79d6480, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6480, cancelling
revoke
(transaction.c, 567): do_get_write_access: buffer_head c79d6480, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6480, cancelling
revoke
(transaction.c, 1104): journal_dirty_metadata: journal_head c79d6480
(transaction.c, 567): do_get_write_access: buffer_head c79d61e0, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d61e0, cancelling
revoke
(transaction.c, 567): do_get_write_access: buffer_head c79d61e0, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d61e0, cancelling
revoke
(transaction.c, 1104): journal_dirty_metadata: journal_head c79d61e0
(transaction.c, 567): do_get_write_access: buffer_head c79d6990, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6990, cancelling
revoke
(transaction.c, 567): do_get_write_access: buffer_head c79d6990, force_copy 0
(revoke.c, 375): journal_cancel_revoke: journal_head c79d6990, cancelling
revoke
(transaction.c, 1104): journal_dirty_metadata: journal_head c79d6990
(transaction.c, 1391): journal_stop: Handle c6cd0500 going down
(transaction.c, 102): start_this_handle: New handle c6cd0500 going live.
(transaction.c, 136): start_this_handle: Handle c6cd0500 stalling...
================
If you need the full log, I can upload it to freevps.com.
I think the race is not in the disk quota code, but in ext3 or JBD :-\
--
With best regards,
Alex
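
A log like the one above can be scanned mechanically for handles that went live but were never stopped. The following is a minimal sketch of such a scanner, assuming only the two message formats visible in the excerpt ("New handle ADDR going live." and "journal_stop: Handle ADDR going down"); the type and function names are hypothetical. It is a heuristic: handle addresses are reused by the allocator, and an excerpt can begin mid-stream, so a "going down" with no earlier "going live" is ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HANDLES 64

struct handle_state {
    char addr[17];      /* handle address as printed in the log */
    int live;           /* "going live" count minus "going down" count */
};

struct handle_table {
    struct handle_state entries[MAX_HANDLES];
    int count;
};

/* Find the entry for addr, creating it if there is room. */
static struct handle_state *table_get(struct handle_table *tab, const char *addr)
{
    for (int i = 0; i < tab->count; i++)
        if (strcmp(tab->entries[i].addr, addr) == 0)
            return &tab->entries[i];
    if (tab->count == MAX_HANDLES)
        return NULL;
    struct handle_state *e = &tab->entries[tab->count++];
    snprintf(e->addr, sizeof e->addr, "%s", addr);
    e->live = 0;
    return e;
}

/* Update the table from one log line; unrelated lines are ignored. */
static void scan_line(struct handle_table *tab, const char *line)
{
    char addr[17];
    const char *p;
    struct handle_state *e;

    if ((p = strstr(line, "New handle ")) != NULL &&
        strstr(p, "going live") != NULL &&
        sscanf(p, "New handle %16s", addr) == 1) {
        if ((e = table_get(tab, addr)) != NULL)
            e->live++;
    } else if ((p = strstr(line, "journal_stop: Handle ")) != NULL &&
               strstr(p, "going down") != NULL &&
               sscanf(p, "journal_stop: Handle %16s", addr) == 1) {
        /* Ignore a stop whose start was before the excerpt began. */
        if ((e = table_get(tab, addr)) != NULL && e->live > 0)
            e->live--;
    }
}

/* Number of handle addresses that are still live at the end of the log. */
static int count_open(const struct handle_table *tab)
{
    int open = 0;
    for (int i = 0; i < tab->count; i++)
        if (tab->entries[i].live > 0)
            open++;
    return open;
}
```

Feeding each line of a captured log to `scan_line()` and then calling `count_open()` gives a rough count of handles that were started but not stopped, which is a starting point for finding the task that keeps the commit waiting.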
* Re: [BUG] journal handler reference count breaked and fs deadlocked
2003-11-10 11:50 ` Jan Kara
2003-11-10 13:07 ` Alex Lyashkov
@ 2003-11-10 16:07 ` Alex Lyashkov
1 sibling, 0 replies; 8+ messages in thread
From: Alex Lyashkov @ 2003-11-10 16:07 UTC (permalink / raw)
To: Jan Kara; +Cc: Andrew Morton, linux-kernel, Herbert Poetzl
[-- Attachment #1: Type: text/plain, Size: 981 bytes --]
On Monday 10 November 2003 13:50, you wrote:
> > On Monday 10 November 2003 13:13, Jan Kara wrote:
> > > Hi,
> > >
> > > thanks for tracking. Are you able to reproduce the problem also on
> > > recent vanilla kernels (ie. 2.4.22)? Can you try the vanilla kernel
> > > with the attached patch (it should fix one of possible deadlocks).
> >
> > Hi Jan
> >
> > I can`t do it with vanila kernel, because my kernel not be exactly rh
> > kernel.
> >
> > It kernel from my fork vserver project who adapted to RH kernel tree.
> > i do stress testing and see this problems.
> > I see you only rename function, set NO_ATIME to diskquota..
> > and change at one point
> > - commit_dqblk(dquot);
> > + dquot->dq_sb->dq_op->write_dquot(dquot);
> > it`s rignt ?
> > i probe to adapted it fix to my kernel..
>
Yes. It is not a disk quota problem; it is a JBD problem.
I recompiled the kernel with quota support disabled, but the kernel locked up again.
The full JBD log is attached to this mail.
--
With best regards,
Alex
[-- Attachment #2: log_wo_dq.bz2 --]
[-- Type: application/x-bzip2, Size: 6723 bytes --]
Thread overview: 8+ messages
2003-11-09 21:34 [BUG] journal handler reference count breaked and fs deadlocked Alex Lyashkov
2003-11-10 9:12 ` Alex Tomas
2003-11-10 9:55 ` Alex Lyashkov
2003-11-10 11:13 ` Jan Kara
2003-11-10 11:48 ` Alex Lyashkov
2003-11-10 11:50 ` Jan Kara
2003-11-10 13:07 ` Alex Lyashkov
2003-11-10 16:07 ` Alex Lyashkov