From: Sonic Zhang <sonic.zhang@intel.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] About Mark's advice on bug 48
Date: Fri Mar 26 01:26:59 2004
Message-ID: <4063DB4B.6060000@intel.com>
Hi Mark,
I finally found that the second halt is caused by starvation when
ocfs_journal_set_unmounted() tries to acquire osb->publish_lock. In
ocfs_volume_thread(), the delay (in jiffies) that schedule_timeout()
sleeps between up() and down() is too short, so
ocfs_journal_set_unmounted() never gets a chance to see the lock free
between its release and its reacquisition by ocfs_volume_thread(). As a
result, ocfs_journal_set_unmounted() waits in its loop forever. After I
changed the delay from 50 to 500 jiffies, kernel 2.6 no longer halts
when it reboots after an OCFS volume has been mounted.
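One possible reading of why the same constant behaves differently on 2.6: a jiffy is one timer tick, so the wall-clock length of "50 jiffies" depends on the tick rate HZ. The sketch below is plain userspace C, not kernel code. The helper names are hypothetical, and the tick rates 100 (2.4) and 1000 (2.6) are assumed typical i386 defaults, not values taken from the patch.

```c
#include <assert.h>

/* Convert a tick count to milliseconds for a given tick rate (ticks/second). */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    return ticks * 1000UL / hz;
}

/* Convert milliseconds to ticks, rounding up so that a short nonzero
 * interval never collapses to zero ticks. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999UL) / 1000UL;
}

/* Checks documenting the arithmetic under the ASSUMED tick rates. */
static void self_check(void)
{
    assert(ticks_to_ms(50, 100) == 500);   /* assumed 2.4 rate: half a second */
    assert(ticks_to_ms(50, 1000) == 50);   /* assumed 2.6 rate: ten times shorter */
    assert(ticks_to_ms(500, 1000) == 500); /* 500 ticks restores half a second */
    assert(ms_to_ticks(500, 100) == 50);
    assert(ms_to_ticks(500, 1000) == 500);
}
```

If that reading is right, writing the interval in terms of the kernel's HZ macro (for example HZ / 2 for half a second) would keep the sleep the same length on both kernels without a version check.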
I also added a line that releases the lock in a branch that jumps to
the label "finally". This may remove a latent deadlock. In addition, I
clear the pointer OcfsIpcCtxt.task before the thread ocfs_recv_thread()
exits. This prevents an invalid access to the task structure in
ocfs_dismount_volume() when rebooting.
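The lock-release point can be illustrated with a small userspace sketch. It is not the nm.c code: toy_sem, toy_down(), toy_up(), toy_write() and publish_once() are hypothetical stand-ins, used only to show that an early jump to a cleanup label must release the lock it holds.

```c
#include <assert.h>

/* Hypothetical stand-in for a semaphore: count 1 = free, 0 = held. */
struct toy_sem { int count; };

static void toy_down(struct toy_sem *s) { assert(s->count == 1); s->count = 0; }
static void toy_up(struct toy_sem *s)   { assert(s->count == 0); s->count = 1; }

/* Hypothetical write step; returns a negative status on failure. */
static int toy_write(int fail) { return fail ? -5 : 0; }

static int publish_once(struct toy_sem *lock, int fail)
{
    int status;

    toy_down(lock);
    status = toy_write(fail);
    if (status < 0) {
        toy_up(lock);   /* release before leaving through the error path */
        goto finally;
    }
    toy_up(lock);       /* the normal path releases as well */

finally:
    return status;
}
```

Without the toy_up() call in the error branch, the next toy_down() on the same semaphore would never succeed, which is the kind of latent deadlock described above.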
Here is my patch to file nm.c.
-------------------------------------------------------------------
--- ocfs2.old/src/nm.c.old 2004-03-26 15:21:32.000000000 +0800
+++ ocfs2/src/nm.c 2004-03-26 15:21:06.000000000 +0800
@@ -119,6 +119,8 @@
OcfsIpcCtxt.recv_sock = NULL;
}
+ OcfsIpcCtxt.task = NULL;
+
/* signal main thread of ipcdlm's exit */
complete (&(OcfsIpcCtxt.complete));
@@ -227,6 +229,12 @@
//#define OCFS_BH_SEM_PRUNE_LIMIT 60 // prune everything each 30 seconds
#define OCFS_BH_SEM_PRUNE_LIMIT 60000 // 8 hours :)
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+#define OCFS_SCHEDULE_TIMEOUT_JIFFIES 500
+#else
+#define OCFS_SCHEDULE_TIMEOUT_JIFFIES 50
+#endif
+
/*
* ocfs_volume_thread()
*
@@ -409,6 +417,7 @@
OCFS_BH_PUT_DATA(bh);
status = ocfs_write_bh(osb, bh, 0, NULL);
if (status < 0) {
+ up(&(osb->publish_lock));
LOG_ERROR_STATUS (status);
goto finally;
}
@@ -425,7 +434,7 @@
goto finally;
}
}
- osb->hbt = 50 + jiffies;
+ osb->hbt = OCFS_SCHEDULE_TIMEOUT_JIFFIES + jiffies;
finally:
status = 0;
@@ -435,7 +444,7 @@
break;
j = jiffies;
if (time_after (j, (unsigned long) (osb->hbt))) {
- osb->hbt = 50 + j;
+ osb->hbt = OCFS_SCHEDULE_TIMEOUT_JIFFIES + j;
}
set_current_state (TASK_INTERRUPTIBLE);
schedule_timeout (osb->hbt - j);