* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-01-21 9:21 swhiteho
0 siblings, 0 replies; 35+ messages in thread
From: swhiteho @ 2008-01-21 9:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here is the current GFS2 patch queue. You'll notice that this time
there are no DLM patches in this list. That is because the DLM team
are setting up their own git tree and this future DLM patches will
be sent directly by them rather than via the GFS2 tree.
Most of this set of patches is clean up and bug fixes, there is really
not a lot new this time. I guess the most significant thing is the
patch to use ->page_mkwrite which will greatly increase efficiency
when files opened r/w are mostly only accessed for reading across a
cluster.
There are a number of cleanups related to journalling which is really
where the largest number of changes in terms of code lines is. The
indirect blocks for the journal are now scanned once only at mount
time and the bmap information is retained in the form of an extent
list. Since we expect journals to consist of only a single extent
in the normal case, this should generally be quite a short list :-)
In addition some of the tunables relating to the journal have been
removed in favour of autotuning those variables.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-04-17 8:37 swhiteho
0 siblings, 0 replies; 35+ messages in thread
From: swhiteho @ 2008-04-17 8:37 UTC (permalink / raw)
To: cluster-devel.redhat.com
This is the current content of the GFS2 -nmw git tree. Mostly bug
fixes, there are some changes relating to block mapping which are
working towards cleaning up this code and allowing more efficient
block mapping. There is a second part to that work which is not
included in this patch set - the plan is that it will be in the
next patch set and its currently undergoing testing.
There are a number of clean up patches in the series too. We have
been continuing the work of gradually reducing the fields in the
various ..._host structures with a view to eventually eliminating
them completely. They were introduced as a stop-gap measure to fix
the endianess annotation and the fields are now gradually being moved
to other structures (or eliminated).
Bob Peterson's new improved "bitfit" algorithm provides a nice
speed up when we are allocating blocks as well as cleaning up
that area of the code.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-07-11 10:11 swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 01/18] [GFS2] Clean up the glock core swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
So, although the merge window isn't yet open, I'm guessing that its
probably not too far away, hence this posting of the contents of
the current GFS2 -nmw git tree.
This time the big news is locking changes, although having said that,
there are far fewer queued patches in total than I've had for previous
merge windows and I believe that this is an indication of the
growing maturity of GFS2.
The first patch in the series is really the main change and is
a big clean up of the core of the glocks, which are really the
core of GFS2 in a lot of ways. Further through the series is
a documentation patch, which explains the fine detail of how
glocks work and the assumptions made by the glock core when
calling the functions relating to individual glock types.
Other notable changes include merging the lock_nolock module
into the core of GFS2 since there is little point in retaining
it separately. There is a plan to do the same to lock_dlm as
well in the future (not the DLM itself obviously, just the
interface module thats part of GFS2).
Most of the remaing changes are bug fixes or futher optimisations
over the initial glock changes, plus one or two minor clean ups
along the way.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 01/18] [GFS2] Clean up the glock core
2008-07-11 10:11 [Cluster-devel] [GFS2] Pre-pull patch posting swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 02/18] [GFS2] Fix ordering bug in lock_dlm swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
This patch implements a number of cleanups to the core of the
GFS2 glock code. As a result a lot of code is removed. It looks
like a really big change, but actually a large part of this patch
is either removing or moving existing code.
There are some new bits too though, such as the new run_queue()
function which is considerably streamlined. Highlights of this
patch include:
o Fixes a cluster coherency bug during SH -> EX lock conversions
o Removes the "glmutex" code in favour of a single bit lock
o Removes the ->go_xmote_bh() for inodes since it was duplicating
->go_lock()
o We now only use the ->lm_lock() function for both locks and
unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
o The fast path is considerably shortly, giving performance gains
especially with lock_nolock
o The glock_workqueue is now used for all the callbacks from the DLM
which allows us to simplify the lock_dlm module (see following patch)
o The way is now open to make further changes such as eliminating the two
threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
scheme.
This patch has undergone extensive testing with various test suites
so it should be pretty stable by now.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index d636b3e..519a54c 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -45,21 +45,19 @@ struct gfs2_gl_hash_bucket {
struct hlist_head hb_list;
};
-struct glock_iter {
- int hash; /* hash bucket index */
- struct gfs2_sbd *sdp; /* incore superblock */
- struct gfs2_glock *gl; /* current glock struct */
- struct seq_file *seq; /* sequence file for debugfs */
- char string[512]; /* scratch space */
+struct gfs2_glock_iter {
+ int hash; /* hash bucket index */
+ struct gfs2_sbd *sdp; /* incore superblock */
+ struct gfs2_glock *gl; /* current glock struct */
+ char string[512]; /* scratch space */
};
typedef void (*glock_examiner) (struct gfs2_glock * gl);
static int gfs2_dump_lockstate(struct gfs2_sbd *sdp);
-static int dump_glock(struct glock_iter *gi, struct gfs2_glock *gl);
-static void gfs2_glock_xmote_th(struct gfs2_glock *gl, struct gfs2_holder *gh);
-static void gfs2_glock_drop_th(struct gfs2_glock *gl);
-static void run_queue(struct gfs2_glock *gl);
+static int __dump_glock(struct seq_file *seq, const struct gfs2_glock *gl);
+#define GLOCK_BUG_ON(gl,x) do { if (unlikely(x)) { __dump_glock(NULL, gl); BUG(); } } while(0)
+static void do_xmote(struct gfs2_glock *gl, struct gfs2_holder *gh, unsigned int target);
static DECLARE_RWSEM(gfs2_umount_flush_sem);
static struct dentry *gfs2_root;
@@ -123,33 +121,6 @@ static inline rwlock_t *gl_lock_addr(unsigned int x)
#endif
/**
- * relaxed_state_ok - is a requested lock compatible with the current lock mode?
- * @actual: the current state of the lock
- * @requested: the lock state that was requested by the caller
- * @flags: the modifier flags passed in by the caller
- *
- * Returns: 1 if the locks are compatible, 0 otherwise
- */
-
-static inline int relaxed_state_ok(unsigned int actual, unsigned requested,
- int flags)
-{
- if (actual == requested)
- return 1;
-
- if (flags & GL_EXACT)
- return 0;
-
- if (actual == LM_ST_EXCLUSIVE && requested == LM_ST_SHARED)
- return 1;
-
- if (actual != LM_ST_UNLOCKED && (flags & LM_FLAG_ANY))
- return 1;
-
- return 0;
-}
-
-/**
* gl_hash() - Turn glock number into hash bucket number
* @lock: The glock number
*
@@ -211,17 +182,14 @@ static void gfs2_glock_hold(struct gfs2_glock *gl)
int gfs2_glock_put(struct gfs2_glock *gl)
{
int rv = 0;
- struct gfs2_sbd *sdp = gl->gl_sbd;
write_lock(gl_lock_addr(gl->gl_hash));
if (atomic_dec_and_test(&gl->gl_ref)) {
hlist_del(&gl->gl_list);
write_unlock(gl_lock_addr(gl->gl_hash));
- gfs2_assert(sdp, gl->gl_state == LM_ST_UNLOCKED);
- gfs2_assert(sdp, list_empty(&gl->gl_reclaim));
- gfs2_assert(sdp, list_empty(&gl->gl_holders));
- gfs2_assert(sdp, list_empty(&gl->gl_waiters1));
- gfs2_assert(sdp, list_empty(&gl->gl_waiters3));
+ GLOCK_BUG_ON(gl, gl->gl_state != LM_ST_UNLOCKED);
+ GLOCK_BUG_ON(gl, !list_empty(&gl->gl_reclaim));
+ GLOCK_BUG_ON(gl, !list_empty(&gl->gl_holders));
glock_free(gl);
rv = 1;
goto out;
@@ -281,16 +249,382 @@ static struct gfs2_glock *gfs2_glock_find(const struct gfs2_sbd *sdp,
return gl;
}
+/**
+ * may_grant - check if its ok to grant a new lock
+ * @gl: The glock
+ * @gh: The lock request which we wish to grant
+ *
+ * Returns: true if its ok to grant the lock
+ */
+
+static inline int may_grant(const struct gfs2_glock *gl, const struct gfs2_holder *gh)
+{
+ const struct gfs2_holder *gh_head = list_entry(gl->gl_holders.next, const struct gfs2_holder, gh_list);
+ if ((gh->gh_state == LM_ST_EXCLUSIVE ||
+ gh_head->gh_state == LM_ST_EXCLUSIVE) && gh != gh_head)
+ return 0;
+ if (gl->gl_state == gh->gh_state)
+ return 1;
+ if (gh->gh_flags & GL_EXACT)
+ return 0;
+ if (gh->gh_state == LM_ST_SHARED && gl->gl_state == LM_ST_EXCLUSIVE)
+ return 1;
+ if (gl->gl_state != LM_ST_UNLOCKED && (gh->gh_flags & LM_FLAG_ANY))
+ return 1;
+ return 0;
+}
+
+static void gfs2_holder_wake(struct gfs2_holder *gh)
+{
+ clear_bit(HIF_WAIT, &gh->gh_iflags);
+ smp_mb__after_clear_bit();
+ wake_up_bit(&gh->gh_iflags, HIF_WAIT);
+}
+
+/**
+ * do_promote - promote as many requests as possible on the current queue
+ * @gl: The glock
+ *
+ * Returns: true if there is a blocked holder at the head of the list
+ */
+
+static int do_promote(struct gfs2_glock *gl)
+{
+ const struct gfs2_glock_operations *glops = gl->gl_ops;
+ struct gfs2_holder *gh, *tmp;
+ int ret;
+
+restart:
+ list_for_each_entry_safe(gh, tmp, &gl->gl_holders, gh_list) {
+ if (test_bit(HIF_HOLDER, &gh->gh_iflags))
+ continue;
+ if (may_grant(gl, gh)) {
+ if (gh->gh_list.prev == &gl->gl_holders &&
+ glops->go_lock) {
+ spin_unlock(&gl->gl_spin);
+ /* FIXME: eliminate this eventually */
+ ret = glops->go_lock(gh);
+ spin_lock(&gl->gl_spin);
+ if (ret) {
+ gh->gh_error = ret;
+ list_del_init(&gh->gh_list);
+ gfs2_holder_wake(gh);
+ goto restart;
+ }
+ set_bit(HIF_HOLDER, &gh->gh_iflags);
+ gfs2_holder_wake(gh);
+ goto restart;
+ }
+ set_bit(HIF_HOLDER, &gh->gh_iflags);
+ gfs2_holder_wake(gh);
+ continue;
+ }
+ if (gh->gh_list.prev == &gl->gl_holders)
+ return 1;
+ break;
+ }
+ return 0;
+}
+
+/**
+ * do_error - Something unexpected has happened during a lock request
+ *
+ */
+
+static inline void do_error(struct gfs2_glock *gl, const int ret)
+{
+ struct gfs2_holder *gh, *tmp;
+
+ list_for_each_entry_safe(gh, tmp, &gl->gl_holders, gh_list) {
+ if (test_bit(HIF_HOLDER, &gh->gh_iflags))
+ continue;
+ if (ret & LM_OUT_ERROR)
+ gh->gh_error = -EIO;
+ else if (gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB))
+ gh->gh_error = GLR_TRYFAILED;
+ else
+ continue;
+ list_del_init(&gh->gh_list);
+ gfs2_holder_wake(gh);
+ }
+}
+
+/**
+ * find_first_waiter - find the first gh that's waiting for the glock
+ * @gl: the glock
+ */
+
+static inline struct gfs2_holder *find_first_waiter(const struct gfs2_glock *gl)
+{
+ struct gfs2_holder *gh;
+
+ list_for_each_entry(gh, &gl->gl_holders, gh_list) {
+ if (!test_bit(HIF_HOLDER, &gh->gh_iflags))
+ return gh;
+ }
+ return NULL;
+}
+
+/**
+ * state_change - record that the glock is now in a different state
+ * @gl: the glock
+ * @new_state the new state
+ *
+ */
+
+static void state_change(struct gfs2_glock *gl, unsigned int new_state)
+{
+ int held1, held2;
+
+ held1 = (gl->gl_state != LM_ST_UNLOCKED);
+ held2 = (new_state != LM_ST_UNLOCKED);
+
+ if (held1 != held2) {
+ if (held2)
+ gfs2_glock_hold(gl);
+ else
+ gfs2_glock_put(gl);
+ }
+
+ gl->gl_state = new_state;
+ gl->gl_tchange = jiffies;
+}
+
+static void gfs2_demote_wake(struct gfs2_glock *gl)
+{
+ gl->gl_demote_state = LM_ST_EXCLUSIVE;
+ clear_bit(GLF_DEMOTE, &gl->gl_flags);
+ smp_mb__after_clear_bit();
+ wake_up_bit(&gl->gl_flags, GLF_DEMOTE);
+}
+
+/**
+ * finish_xmote - The DLM has replied to one of our lock requests
+ * @gl: The glock
+ * @ret: The status from the DLM
+ *
+ */
+
+static void finish_xmote(struct gfs2_glock *gl, unsigned int ret)
+{
+ const struct gfs2_glock_operations *glops = gl->gl_ops;
+ struct gfs2_holder *gh;
+ unsigned state = ret & LM_OUT_ST_MASK;
+
+ spin_lock(&gl->gl_spin);
+ state_change(gl, state);
+ gh = find_first_waiter(gl);
+
+ /* Demote to UN request arrived during demote to SH or DF */
+ if (test_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags) &&
+ state != LM_ST_UNLOCKED && gl->gl_demote_state == LM_ST_UNLOCKED)
+ gl->gl_target = LM_ST_UNLOCKED;
+
+ /* Check for state != intended state */
+ if (unlikely(state != gl->gl_target)) {
+ if (gh && !test_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags)) {
+ /* move to back of queue and try next entry */
+ if (ret & LM_OUT_CANCELED) {
+ if ((gh->gh_flags & LM_FLAG_PRIORITY) == 0)
+ list_move_tail(&gh->gh_list, &gl->gl_holders);
+ gh = find_first_waiter(gl);
+ gl->gl_target = gh->gh_state;
+ goto retry;
+ }
+ /* Some error or failed "try lock" - report it */
+ if ((ret & LM_OUT_ERROR) ||
+ (gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB))) {
+ gl->gl_target = gl->gl_state;
+ do_error(gl, ret);
+ goto out;
+ }
+ }
+ switch(state) {
+ /* Unlocked due to conversion deadlock, try again */
+ case LM_ST_UNLOCKED:
+retry:
+ do_xmote(gl, gh, gl->gl_target);
+ break;
+ /* Conversion fails, unlock and try again */
+ case LM_ST_SHARED:
+ case LM_ST_DEFERRED:
+ do_xmote(gl, gh, LM_ST_UNLOCKED);
+ break;
+ default: /* Everything else */
+ printk(KERN_ERR "GFS2: wanted %u got %u\n", gl->gl_target, state);
+ GLOCK_BUG_ON(gl, 1);
+ }
+ spin_unlock(&gl->gl_spin);
+ gfs2_glock_put(gl);
+ return;
+ }
+
+ /* Fast path - we got what we asked for */
+ if (test_and_clear_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags))
+ gfs2_demote_wake(gl);
+ if (state != LM_ST_UNLOCKED) {
+ if (glops->go_xmote_bh) {
+ int rv;
+ spin_unlock(&gl->gl_spin);
+ rv = glops->go_xmote_bh(gl, gh);
+ if (rv == -EAGAIN)
+ return;
+ spin_lock(&gl->gl_spin);
+ if (rv) {
+ do_error(gl, rv);
+ goto out;
+ }
+ }
+ do_promote(gl);
+ }
+out:
+ clear_bit(GLF_LOCK, &gl->gl_flags);
+ spin_unlock(&gl->gl_spin);
+ gfs2_glock_put(gl);
+}
+
+static unsigned int gfs2_lm_lock(struct gfs2_sbd *sdp, void *lock,
+ unsigned int cur_state, unsigned int req_state,
+ unsigned int flags)
+{
+ int ret = LM_OUT_ERROR;
+ if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ ret = sdp->sd_lockstruct.ls_ops->lm_lock(lock, cur_state,
+ req_state, flags);
+ return ret;
+}
+
+/**
+ * do_xmote - Calls the DLM to change the state of a lock
+ * @gl: The lock state
+ * @gh: The holder (only for promotes)
+ * @target: The target lock state
+ *
+ */
+
+static void do_xmote(struct gfs2_glock *gl, struct gfs2_holder *gh, unsigned int target)
+{
+ const struct gfs2_glock_operations *glops = gl->gl_ops;
+ struct gfs2_sbd *sdp = gl->gl_sbd;
+ unsigned int lck_flags = gh ? gh->gh_flags : 0;
+ int ret;
+
+ lck_flags &= (LM_FLAG_TRY | LM_FLAG_TRY_1CB | LM_FLAG_NOEXP |
+ LM_FLAG_PRIORITY);
+ BUG_ON(gl->gl_state == target);
+ BUG_ON(gl->gl_state == gl->gl_target);
+ if ((target == LM_ST_UNLOCKED || target == LM_ST_DEFERRED) &&
+ glops->go_inval) {
+ set_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+ do_error(gl, 0); /* Fail queued try locks */
+ }
+ spin_unlock(&gl->gl_spin);
+ if (glops->go_xmote_th)
+ glops->go_xmote_th(gl);
+ if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
+ glops->go_inval(gl, target == LM_ST_DEFERRED ? 0 : DIO_METADATA);
+ clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
+
+ gfs2_glock_hold(gl);
+ if (target != LM_ST_UNLOCKED && (gl->gl_state == LM_ST_SHARED ||
+ gl->gl_state == LM_ST_DEFERRED) &&
+ !(lck_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)))
+ lck_flags |= LM_FLAG_TRY_1CB;
+ ret = gfs2_lm_lock(sdp, gl->gl_lock, gl->gl_state, target, lck_flags);
+
+ if (!(ret & LM_OUT_ASYNC)) {
+ finish_xmote(gl, ret);
+ gfs2_glock_hold(gl);
+ if (queue_delayed_work(glock_workqueue, &gl->gl_work, 0) == 0)
+ gfs2_glock_put(gl);
+ } else {
+ GLOCK_BUG_ON(gl, ret != LM_OUT_ASYNC);
+ }
+ spin_lock(&gl->gl_spin);
+}
+
+/**
+ * find_first_holder - find the first "holder" gh
+ * @gl: the glock
+ */
+
+static inline struct gfs2_holder *find_first_holder(const struct gfs2_glock *gl)
+{
+ struct gfs2_holder *gh;
+
+ if (!list_empty(&gl->gl_holders)) {
+ gh = list_entry(gl->gl_holders.next, struct gfs2_holder, gh_list);
+ if (test_bit(HIF_HOLDER, &gh->gh_iflags))
+ return gh;
+ }
+ return NULL;
+}
+
+/**
+ * run_queue - do all outstanding tasks related to a glock
+ * @gl: The glock in question
+ * @nonblock: True if we must not block in run_queue
+ *
+ */
+
+static void run_queue(struct gfs2_glock *gl, const int nonblock)
+{
+ struct gfs2_holder *gh = NULL;
+
+ if (test_and_set_bit(GLF_LOCK, &gl->gl_flags))
+ return;
+
+ GLOCK_BUG_ON(gl, test_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags));
+
+ if (test_bit(GLF_DEMOTE, &gl->gl_flags) &&
+ gl->gl_demote_state != gl->gl_state) {
+ if (find_first_holder(gl))
+ goto out;
+ if (nonblock)
+ goto out_sched;
+ set_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
+ gl->gl_target = gl->gl_demote_state;
+ } else {
+ if (test_bit(GLF_DEMOTE, &gl->gl_flags))
+ gfs2_demote_wake(gl);
+ if (do_promote(gl) == 0)
+ goto out;
+ gh = find_first_waiter(gl);
+ gl->gl_target = gh->gh_state;
+ if (!(gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)))
+ do_error(gl, 0); /* Fail queued try locks */
+ }
+ do_xmote(gl, gh, gl->gl_target);
+ return;
+
+out_sched:
+ gfs2_glock_hold(gl);
+ if (queue_delayed_work(glock_workqueue, &gl->gl_work, 0) == 0)
+ gfs2_glock_put(gl);
+out:
+ clear_bit(GLF_LOCK, &gl->gl_flags);
+}
+
static void glock_work_func(struct work_struct *work)
{
+ unsigned long delay = 0;
struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_work.work);
+ if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags))
+ finish_xmote(gl, gl->gl_reply);
spin_lock(&gl->gl_spin);
- if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags))
- set_bit(GLF_DEMOTE, &gl->gl_flags);
- run_queue(gl);
+ if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags)) {
+ unsigned long holdtime, now = jiffies;
+ holdtime = gl->gl_tchange + gl->gl_ops->go_min_hold_time;
+ if (time_before(now, holdtime))
+ delay = holdtime - now;
+ set_bit(delay ? GLF_PENDING_DEMOTE : GLF_DEMOTE, &gl->gl_flags);
+ }
+ run_queue(gl, 0);
spin_unlock(&gl->gl_spin);
- gfs2_glock_put(gl);
+ if (!delay ||
+ queue_delayed_work(glock_workqueue, &gl->gl_work, delay) == 0)
+ gfs2_glock_put(gl);
}
static int gfs2_lm_get_lock(struct gfs2_sbd *sdp, struct lm_lockname *name,
@@ -342,12 +676,10 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
gl->gl_name = name;
atomic_set(&gl->gl_ref, 1);
gl->gl_state = LM_ST_UNLOCKED;
+ gl->gl_target = LM_ST_UNLOCKED;
gl->gl_demote_state = LM_ST_EXCLUSIVE;
gl->gl_hash = hash;
- gl->gl_owner_pid = NULL;
- gl->gl_ip = 0;
gl->gl_ops = glops;
- gl->gl_req_gh = NULL;
gl->gl_stamp = jiffies;
gl->gl_tchange = jiffies;
gl->gl_object = NULL;
@@ -447,13 +779,6 @@ void gfs2_holder_uninit(struct gfs2_holder *gh)
gh->gh_ip = 0;
}
-static void gfs2_holder_wake(struct gfs2_holder *gh)
-{
- clear_bit(HIF_WAIT, &gh->gh_iflags);
- smp_mb__after_clear_bit();
- wake_up_bit(&gh->gh_iflags, HIF_WAIT);
-}
-
static int just_schedule(void *word)
{
schedule();
@@ -466,14 +791,6 @@ static void wait_on_holder(struct gfs2_holder *gh)
wait_on_bit(&gh->gh_iflags, HIF_WAIT, just_schedule, TASK_UNINTERRUPTIBLE);
}
-static void gfs2_demote_wake(struct gfs2_glock *gl)
-{
- gl->gl_demote_state = LM_ST_EXCLUSIVE;
- clear_bit(GLF_DEMOTE, &gl->gl_flags);
- smp_mb__after_clear_bit();
- wake_up_bit(&gl->gl_flags, GLF_DEMOTE);
-}
-
static void wait_on_demote(struct gfs2_glock *gl)
{
might_sleep();
@@ -481,217 +798,6 @@ static void wait_on_demote(struct gfs2_glock *gl)
}
/**
- * rq_mutex - process a mutex request in the queue
- * @gh: the glock holder
- *
- * Returns: 1 if the queue is blocked
- */
-
-static int rq_mutex(struct gfs2_holder *gh)
-{
- struct gfs2_glock *gl = gh->gh_gl;
-
- list_del_init(&gh->gh_list);
- /* gh->gh_error never examined. */
- set_bit(GLF_LOCK, &gl->gl_flags);
- clear_bit(HIF_WAIT, &gh->gh_iflags);
- smp_mb();
- wake_up_bit(&gh->gh_iflags, HIF_WAIT);
-
- return 1;
-}
-
-/**
- * rq_promote - process a promote request in the queue
- * @gh: the glock holder
- *
- * Acquire a new inter-node lock, or change a lock state to more restrictive.
- *
- * Returns: 1 if the queue is blocked
- */
-
-static int rq_promote(struct gfs2_holder *gh)
-{
- struct gfs2_glock *gl = gh->gh_gl;
-
- if (!relaxed_state_ok(gl->gl_state, gh->gh_state, gh->gh_flags)) {
- if (list_empty(&gl->gl_holders)) {
- gl->gl_req_gh = gh;
- set_bit(GLF_LOCK, &gl->gl_flags);
- spin_unlock(&gl->gl_spin);
- gfs2_glock_xmote_th(gh->gh_gl, gh);
- spin_lock(&gl->gl_spin);
- }
- return 1;
- }
-
- if (list_empty(&gl->gl_holders)) {
- set_bit(HIF_FIRST, &gh->gh_iflags);
- set_bit(GLF_LOCK, &gl->gl_flags);
- } else {
- struct gfs2_holder *next_gh;
- if (gh->gh_state == LM_ST_EXCLUSIVE)
- return 1;
- next_gh = list_entry(gl->gl_holders.next, struct gfs2_holder,
- gh_list);
- if (next_gh->gh_state == LM_ST_EXCLUSIVE)
- return 1;
- }
-
- list_move_tail(&gh->gh_list, &gl->gl_holders);
- gh->gh_error = 0;
- set_bit(HIF_HOLDER, &gh->gh_iflags);
-
- gfs2_holder_wake(gh);
-
- return 0;
-}
-
-/**
- * rq_demote - process a demote request in the queue
- * @gh: the glock holder
- *
- * Returns: 1 if the queue is blocked
- */
-
-static int rq_demote(struct gfs2_glock *gl)
-{
- if (!list_empty(&gl->gl_holders))
- return 1;
-
- if (gl->gl_state == gl->gl_demote_state ||
- gl->gl_state == LM_ST_UNLOCKED) {
- gfs2_demote_wake(gl);
- return 0;
- }
-
- set_bit(GLF_LOCK, &gl->gl_flags);
- set_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
-
- if (gl->gl_demote_state == LM_ST_UNLOCKED ||
- gl->gl_state != LM_ST_EXCLUSIVE) {
- spin_unlock(&gl->gl_spin);
- gfs2_glock_drop_th(gl);
- } else {
- spin_unlock(&gl->gl_spin);
- gfs2_glock_xmote_th(gl, NULL);
- }
-
- spin_lock(&gl->gl_spin);
- clear_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
-
- return 0;
-}
-
-/**
- * run_queue - process holder structures on a glock
- * @gl: the glock
- *
- */
-static void run_queue(struct gfs2_glock *gl)
-{
- struct gfs2_holder *gh;
- int blocked = 1;
-
- for (;;) {
- if (test_bit(GLF_LOCK, &gl->gl_flags))
- break;
-
- if (!list_empty(&gl->gl_waiters1)) {
- gh = list_entry(gl->gl_waiters1.next,
- struct gfs2_holder, gh_list);
- blocked = rq_mutex(gh);
- } else if (test_bit(GLF_DEMOTE, &gl->gl_flags)) {
- blocked = rq_demote(gl);
- if (test_bit(GLF_WAITERS2, &gl->gl_flags) &&
- !blocked) {
- set_bit(GLF_DEMOTE, &gl->gl_flags);
- gl->gl_demote_state = LM_ST_UNLOCKED;
- }
- clear_bit(GLF_WAITERS2, &gl->gl_flags);
- } else if (!list_empty(&gl->gl_waiters3)) {
- gh = list_entry(gl->gl_waiters3.next,
- struct gfs2_holder, gh_list);
- blocked = rq_promote(gh);
- } else
- break;
-
- if (blocked)
- break;
- }
-}
-
-/**
- * gfs2_glmutex_lock - acquire a local lock on a glock
- * @gl: the glock
- *
- * Gives caller exclusive access to manipulate a glock structure.
- */
-
-static void gfs2_glmutex_lock(struct gfs2_glock *gl)
-{
- spin_lock(&gl->gl_spin);
- if (test_and_set_bit(GLF_LOCK, &gl->gl_flags)) {
- struct gfs2_holder gh;
-
- gfs2_holder_init(gl, 0, 0, &gh);
- set_bit(HIF_WAIT, &gh.gh_iflags);
- list_add_tail(&gh.gh_list, &gl->gl_waiters1);
- spin_unlock(&gl->gl_spin);
- wait_on_holder(&gh);
- gfs2_holder_uninit(&gh);
- } else {
- gl->gl_owner_pid = get_pid(task_pid(current));
- gl->gl_ip = (unsigned long)__builtin_return_address(0);
- spin_unlock(&gl->gl_spin);
- }
-}
-
-/**
- * gfs2_glmutex_trylock - try to acquire a local lock on a glock
- * @gl: the glock
- *
- * Returns: 1 if the glock is acquired
- */
-
-static int gfs2_glmutex_trylock(struct gfs2_glock *gl)
-{
- int acquired = 1;
-
- spin_lock(&gl->gl_spin);
- if (test_and_set_bit(GLF_LOCK, &gl->gl_flags)) {
- acquired = 0;
- } else {
- gl->gl_owner_pid = get_pid(task_pid(current));
- gl->gl_ip = (unsigned long)__builtin_return_address(0);
- }
- spin_unlock(&gl->gl_spin);
-
- return acquired;
-}
-
-/**
- * gfs2_glmutex_unlock - release a local lock on a glock
- * @gl: the glock
- *
- */
-
-static void gfs2_glmutex_unlock(struct gfs2_glock *gl)
-{
- struct pid *pid;
-
- spin_lock(&gl->gl_spin);
- clear_bit(GLF_LOCK, &gl->gl_flags);
- pid = gl->gl_owner_pid;
- gl->gl_owner_pid = NULL;
- gl->gl_ip = 0;
- run_queue(gl);
- spin_unlock(&gl->gl_spin);
-
- put_pid(pid);
-}
-
-/**
* handle_callback - process a demote request
* @gl: the glock
* @state: the state the caller wants us to change to
@@ -705,398 +811,45 @@ static void handle_callback(struct gfs2_glock *gl, unsigned int state,
{
int bit = delay ? GLF_PENDING_DEMOTE : GLF_DEMOTE;
- spin_lock(&gl->gl_spin);
set_bit(bit, &gl->gl_flags);
if (gl->gl_demote_state == LM_ST_EXCLUSIVE) {
gl->gl_demote_state = state;
gl->gl_demote_time = jiffies;
if (remote && gl->gl_ops->go_type == LM_TYPE_IOPEN &&
- gl->gl_object) {
+ gl->gl_object)
gfs2_glock_schedule_for_reclaim(gl);
- spin_unlock(&gl->gl_spin);
- return;
- }
} else if (gl->gl_demote_state != LM_ST_UNLOCKED &&
gl->gl_demote_state != state) {
- if (test_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags))
- set_bit(GLF_WAITERS2, &gl->gl_flags);
- else
- gl->gl_demote_state = LM_ST_UNLOCKED;
+ gl->gl_demote_state = LM_ST_UNLOCKED;
}
- spin_unlock(&gl->gl_spin);
}
/**
- * state_change - record that the glock is now in a different state
- * @gl: the glock
- * @new_state the new state
- *
- */
-
-static void state_change(struct gfs2_glock *gl, unsigned int new_state)
-{
- int held1, held2;
-
- held1 = (gl->gl_state != LM_ST_UNLOCKED);
- held2 = (new_state != LM_ST_UNLOCKED);
-
- if (held1 != held2) {
- if (held2)
- gfs2_glock_hold(gl);
- else
- gfs2_glock_put(gl);
- }
-
- gl->gl_state = new_state;
- gl->gl_tchange = jiffies;
-}
-
-/**
- * drop_bh - Called after a lock module unlock completes
- * @gl: the glock
- * @ret: the return status
- *
- * Doesn't wake up the process waiting on the struct gfs2_holder (if any)
- * Doesn't drop the reference on the glock the top half took out
- *
- */
-
-static void drop_bh(struct gfs2_glock *gl, unsigned int ret)
-{
- struct gfs2_sbd *sdp = gl->gl_sbd;
- struct gfs2_holder *gh = gl->gl_req_gh;
-
- gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
- gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
- gfs2_assert_warn(sdp, !ret);
-
- state_change(gl, LM_ST_UNLOCKED);
-
- if (test_and_clear_bit(GLF_CONV_DEADLK, &gl->gl_flags)) {
- spin_lock(&gl->gl_spin);
- gh->gh_error = 0;
- spin_unlock(&gl->gl_spin);
- gfs2_glock_xmote_th(gl, gl->gl_req_gh);
- gfs2_glock_put(gl);
- return;
- }
-
- spin_lock(&gl->gl_spin);
- gfs2_demote_wake(gl);
- clear_bit(GLF_LOCK, &gl->gl_flags);
- spin_unlock(&gl->gl_spin);
- gfs2_glock_put(gl);
-}
-
-/**
- * xmote_bh - Called after the lock module is done acquiring a lock
- * @gl: The glock in question
- * @ret: the int returned from the lock module
- *
- */
-
-static void xmote_bh(struct gfs2_glock *gl, unsigned int ret)
-{
- struct gfs2_sbd *sdp = gl->gl_sbd;
- const struct gfs2_glock_operations *glops = gl->gl_ops;
- struct gfs2_holder *gh = gl->gl_req_gh;
- int op_done = 1;
-
- if (!gh && (ret & LM_OUT_ST_MASK) == LM_ST_UNLOCKED) {
- drop_bh(gl, ret);
- return;
- }
-
- gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
- gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
- gfs2_assert_warn(sdp, !(ret & LM_OUT_ASYNC));
-
- state_change(gl, ret & LM_OUT_ST_MASK);
-
- /* Deal with each possible exit condition */
-
- if (!gh) {
- gl->gl_stamp = jiffies;
- if (ret & LM_OUT_CANCELED) {
- op_done = 0;
- } else {
- spin_lock(&gl->gl_spin);
- if (gl->gl_state != gl->gl_demote_state) {
- spin_unlock(&gl->gl_spin);
- gfs2_glock_drop_th(gl);
- gfs2_glock_put(gl);
- return;
- }
- gfs2_demote_wake(gl);
- spin_unlock(&gl->gl_spin);
- }
- } else {
- spin_lock(&gl->gl_spin);
- if (ret & LM_OUT_CONV_DEADLK) {
- gh->gh_error = 0;
- set_bit(GLF_CONV_DEADLK, &gl->gl_flags);
- spin_unlock(&gl->gl_spin);
- gfs2_glock_drop_th(gl);
- gfs2_glock_put(gl);
- return;
- }
- list_del_init(&gh->gh_list);
- gh->gh_error = -EIO;
- if (unlikely(test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
- goto out;
- gh->gh_error = GLR_CANCELED;
- if (ret & LM_OUT_CANCELED)
- goto out;
- if (relaxed_state_ok(gl->gl_state, gh->gh_state, gh->gh_flags)) {
- list_add_tail(&gh->gh_list, &gl->gl_holders);
- gh->gh_error = 0;
- set_bit(HIF_HOLDER, &gh->gh_iflags);
- set_bit(HIF_FIRST, &gh->gh_iflags);
- op_done = 0;
- goto out;
- }
- gh->gh_error = GLR_TRYFAILED;
- if (gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB))
- goto out;
- gh->gh_error = -EINVAL;
- if (gfs2_assert_withdraw(sdp, 0) == -1)
- fs_err(sdp, "ret = 0x%.8X\n", ret);
-out:
- spin_unlock(&gl->gl_spin);
- }
-
- if (glops->go_xmote_bh)
- glops->go_xmote_bh(gl);
-
- if (op_done) {
- spin_lock(&gl->gl_spin);
- gl->gl_req_gh = NULL;
- clear_bit(GLF_LOCK, &gl->gl_flags);
- spin_unlock(&gl->gl_spin);
- }
-
- gfs2_glock_put(gl);
-
- if (gh)
- gfs2_holder_wake(gh);
-}
-
-static unsigned int gfs2_lm_lock(struct gfs2_sbd *sdp, void *lock,
- unsigned int cur_state, unsigned int req_state,
- unsigned int flags)
-{
- int ret = 0;
- if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
- ret = sdp->sd_lockstruct.ls_ops->lm_lock(lock, cur_state,
- req_state, flags);
- return ret;
-}
-
-/**
- * gfs2_glock_xmote_th - Call into the lock module to acquire or change a glock
- * @gl: The glock in question
- * @state: the requested state
- * @flags: modifier flags to the lock call
- *
- */
-
-static void gfs2_glock_xmote_th(struct gfs2_glock *gl, struct gfs2_holder *gh)
-{
- struct gfs2_sbd *sdp = gl->gl_sbd;
- int flags = gh ? gh->gh_flags : 0;
- unsigned state = gh ? gh->gh_state : gl->gl_demote_state;
- const struct gfs2_glock_operations *glops = gl->gl_ops;
- int lck_flags = flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB |
- LM_FLAG_NOEXP | LM_FLAG_ANY |
- LM_FLAG_PRIORITY);
- unsigned int lck_ret;
-
- if (glops->go_xmote_th)
- glops->go_xmote_th(gl);
- if (state == LM_ST_DEFERRED && glops->go_inval)
- glops->go_inval(gl, DIO_METADATA);
-
- gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
- gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
- gfs2_assert_warn(sdp, state != LM_ST_UNLOCKED);
- gfs2_assert_warn(sdp, state != gl->gl_state);
-
- gfs2_glock_hold(gl);
-
- lck_ret = gfs2_lm_lock(sdp, gl->gl_lock, gl->gl_state, state, lck_flags);
-
- if (gfs2_assert_withdraw(sdp, !(lck_ret & LM_OUT_ERROR)))
- return;
-
- if (lck_ret & LM_OUT_ASYNC)
- gfs2_assert_warn(sdp, lck_ret == LM_OUT_ASYNC);
- else
- xmote_bh(gl, lck_ret);
-}
-
-static unsigned int gfs2_lm_unlock(struct gfs2_sbd *sdp, void *lock,
- unsigned int cur_state)
-{
- int ret = 0;
- if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
- ret = sdp->sd_lockstruct.ls_ops->lm_unlock(lock, cur_state);
- return ret;
-}
-
-/**
- * gfs2_glock_drop_th - call into the lock module to unlock a lock
- * @gl: the glock
- *
- */
-
-static void gfs2_glock_drop_th(struct gfs2_glock *gl)
-{
- struct gfs2_sbd *sdp = gl->gl_sbd;
- const struct gfs2_glock_operations *glops = gl->gl_ops;
- unsigned int ret;
-
- if (glops->go_xmote_th)
- glops->go_xmote_th(gl);
- if (glops->go_inval)
- glops->go_inval(gl, DIO_METADATA);
-
- gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
- gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
- gfs2_assert_warn(sdp, gl->gl_state != LM_ST_UNLOCKED);
-
- gfs2_glock_hold(gl);
-
- ret = gfs2_lm_unlock(sdp, gl->gl_lock, gl->gl_state);
-
- if (gfs2_assert_withdraw(sdp, !(ret & LM_OUT_ERROR)))
- return;
-
- if (!ret)
- drop_bh(gl, ret);
- else
- gfs2_assert_warn(sdp, ret == LM_OUT_ASYNC);
-}
-
-/**
- * do_cancels - cancel requests for locks stuck waiting on an expire flag
- * @gh: the LM_FLAG_PRIORITY holder waiting to acquire the lock
- *
- * Don't cancel GL_NOCANCEL requests.
- */
-
-static void do_cancels(struct gfs2_holder *gh)
-{
- struct gfs2_glock *gl = gh->gh_gl;
- struct gfs2_sbd *sdp = gl->gl_sbd;
-
- spin_lock(&gl->gl_spin);
-
- while (gl->gl_req_gh != gh &&
- !test_bit(HIF_HOLDER, &gh->gh_iflags) &&
- !list_empty(&gh->gh_list)) {
- if (!(gl->gl_req_gh && (gl->gl_req_gh->gh_flags & GL_NOCANCEL))) {
- spin_unlock(&gl->gl_spin);
- if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
- sdp->sd_lockstruct.ls_ops->lm_cancel(gl->gl_lock);
- msleep(100);
- spin_lock(&gl->gl_spin);
- } else {
- spin_unlock(&gl->gl_spin);
- msleep(100);
- spin_lock(&gl->gl_spin);
- }
- }
-
- spin_unlock(&gl->gl_spin);
-}
-
-/**
- * glock_wait_internal - wait on a glock acquisition
+ * gfs2_glock_wait - wait on a glock acquisition
* @gh: the glock holder
*
* Returns: 0 on success
*/
-static int glock_wait_internal(struct gfs2_holder *gh)
+int gfs2_glock_wait(struct gfs2_holder *gh)
{
- struct gfs2_glock *gl = gh->gh_gl;
- struct gfs2_sbd *sdp = gl->gl_sbd;
- const struct gfs2_glock_operations *glops = gl->gl_ops;
-
- if (test_bit(HIF_ABORTED, &gh->gh_iflags))
- return -EIO;
-
- if (gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)) {
- spin_lock(&gl->gl_spin);
- if (gl->gl_req_gh != gh &&
- !test_bit(HIF_HOLDER, &gh->gh_iflags) &&
- !list_empty(&gh->gh_list)) {
- list_del_init(&gh->gh_list);
- gh->gh_error = GLR_TRYFAILED;
- run_queue(gl);
- spin_unlock(&gl->gl_spin);
- return gh->gh_error;
- }
- spin_unlock(&gl->gl_spin);
- }
-
- if (gh->gh_flags & LM_FLAG_PRIORITY)
- do_cancels(gh);
-
wait_on_holder(gh);
- if (gh->gh_error)
- return gh->gh_error;
-
- gfs2_assert_withdraw(sdp, test_bit(HIF_HOLDER, &gh->gh_iflags));
- gfs2_assert_withdraw(sdp, relaxed_state_ok(gl->gl_state, gh->gh_state,
- gh->gh_flags));
-
- if (test_bit(HIF_FIRST, &gh->gh_iflags)) {
- gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
-
- if (glops->go_lock) {
- gh->gh_error = glops->go_lock(gh);
- if (gh->gh_error) {
- spin_lock(&gl->gl_spin);
- list_del_init(&gh->gh_list);
- spin_unlock(&gl->gl_spin);
- }
- }
-
- spin_lock(&gl->gl_spin);
- gl->gl_req_gh = NULL;
- clear_bit(GLF_LOCK, &gl->gl_flags);
- run_queue(gl);
- spin_unlock(&gl->gl_spin);
- }
-
return gh->gh_error;
}
-static inline struct gfs2_holder *
-find_holder_by_owner(struct list_head *head, struct pid *pid)
-{
- struct gfs2_holder *gh;
-
- list_for_each_entry(gh, head, gh_list) {
- if (gh->gh_owner_pid == pid)
- return gh;
- }
-
- return NULL;
-}
-
-static void print_dbg(struct glock_iter *gi, const char *fmt, ...)
+void gfs2_print_dbg(struct seq_file *seq, const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
- if (gi) {
+ if (seq) {
+ struct gfs2_glock_iter *gi = seq->private;
vsprintf(gi->string, fmt, args);
- seq_printf(gi->seq, gi->string);
- }
- else
+ seq_printf(seq, gi->string);
+ } else {
+ printk(KERN_ERR " ");
vprintk(fmt, args);
+ }
va_end(args);
}
@@ -1104,50 +857,75 @@ static void print_dbg(struct glock_iter *gi, const char *fmt, ...)
* add_to_queue - Add a holder to the wait queue (but look for recursion)
* @gh: the holder structure to add
*
+ * Eventually we should move the recursive locking trap to a
+ * debugging option or something like that. This is the fast
+ * path and needs to have the minimum number of distractions.
+ *
*/
-static void add_to_queue(struct gfs2_holder *gh)
+static inline void add_to_queue(struct gfs2_holder *gh)
{
struct gfs2_glock *gl = gh->gh_gl;
- struct gfs2_holder *existing;
+ struct gfs2_sbd *sdp = gl->gl_sbd;
+ struct list_head *insert_pt = NULL;
+ struct gfs2_holder *gh2;
+ int try_lock = 0;
BUG_ON(gh->gh_owner_pid == NULL);
if (test_and_set_bit(HIF_WAIT, &gh->gh_iflags))
BUG();
- if (!(gh->gh_flags & GL_FLOCK)) {
- existing = find_holder_by_owner(&gl->gl_holders,
- gh->gh_owner_pid);
- if (existing) {
- print_symbol(KERN_WARNING "original: %s\n",
- existing->gh_ip);
- printk(KERN_INFO "pid : %d\n",
- pid_nr(existing->gh_owner_pid));
- printk(KERN_INFO "lock type : %d lock state : %d\n",
- existing->gh_gl->gl_name.ln_type,
- existing->gh_gl->gl_state);
- print_symbol(KERN_WARNING "new: %s\n", gh->gh_ip);
- printk(KERN_INFO "pid : %d\n",
- pid_nr(gh->gh_owner_pid));
- printk(KERN_INFO "lock type : %d lock state : %d\n",
- gl->gl_name.ln_type, gl->gl_state);
- BUG();
- }
-
- existing = find_holder_by_owner(&gl->gl_waiters3,
- gh->gh_owner_pid);
- if (existing) {
- print_symbol(KERN_WARNING "original: %s\n",
- existing->gh_ip);
- print_symbol(KERN_WARNING "new: %s\n", gh->gh_ip);
- BUG();
+ if (gh->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)) {
+ if (test_bit(GLF_LOCK, &gl->gl_flags))
+ try_lock = 1;
+ if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
+ goto fail;
+ }
+
+ list_for_each_entry(gh2, &gl->gl_holders, gh_list) {
+ if (unlikely(gh2->gh_owner_pid == gh->gh_owner_pid &&
+ (gh->gh_gl->gl_ops->go_type != LM_TYPE_FLOCK)))
+ goto trap_recursive;
+ if (try_lock &&
+ !(gh2->gh_flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB)) &&
+ !may_grant(gl, gh)) {
+fail:
+ gh->gh_error = GLR_TRYFAILED;
+ gfs2_holder_wake(gh);
+ return;
}
+ if (test_bit(HIF_HOLDER, &gh2->gh_iflags))
+ continue;
+ if (unlikely((gh->gh_flags & LM_FLAG_PRIORITY) && !insert_pt))
+ insert_pt = &gh2->gh_list;
+ }
+ if (likely(insert_pt == NULL)) {
+ list_add_tail(&gh->gh_list, &gl->gl_holders);
+ if (unlikely(gh->gh_flags & LM_FLAG_PRIORITY))
+ goto do_cancel;
+ return;
}
+ list_add_tail(&gh->gh_list, insert_pt);
+do_cancel:
+ gh = list_entry(gl->gl_holders.next, struct gfs2_holder, gh_list);
+ if (!(gh->gh_flags & LM_FLAG_PRIORITY)) {
+ spin_unlock(&gl->gl_spin);
+ sdp->sd_lockstruct.ls_ops->lm_cancel(gl->gl_lock);
+ spin_lock(&gl->gl_spin);
+ }
+ return;
- if (gh->gh_flags & LM_FLAG_PRIORITY)
- list_add(&gh->gh_list, &gl->gl_waiters3);
- else
- list_add_tail(&gh->gh_list, &gl->gl_waiters3);
+trap_recursive:
+ print_symbol(KERN_ERR "original: %s\n", gh2->gh_ip);
+ printk(KERN_ERR "pid: %d\n", pid_nr(gh2->gh_owner_pid));
+ printk(KERN_ERR "lock type: %d req lock state : %d\n",
+ gh2->gh_gl->gl_name.ln_type, gh2->gh_state);
+ print_symbol(KERN_ERR "new: %s\n", gh->gh_ip);
+ printk(KERN_ERR "pid: %d\n", pid_nr(gh->gh_owner_pid));
+ printk(KERN_ERR "lock type: %d req lock state : %d\n",
+ gh->gh_gl->gl_name.ln_type, gh->gh_state);
+ __dump_glock(NULL, gl);
+ BUG();
}
/**
@@ -1165,24 +943,16 @@ int gfs2_glock_nq(struct gfs2_holder *gh)
struct gfs2_sbd *sdp = gl->gl_sbd;
int error = 0;
-restart:
- if (unlikely(test_bit(SDF_SHUTDOWN, &sdp->sd_flags))) {
- set_bit(HIF_ABORTED, &gh->gh_iflags);
+ if (unlikely(test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
return -EIO;
- }
spin_lock(&gl->gl_spin);
add_to_queue(gh);
- run_queue(gl);
+ run_queue(gl, 1);
spin_unlock(&gl->gl_spin);
- if (!(gh->gh_flags & GL_ASYNC)) {
- error = glock_wait_internal(gh);
- if (error == GLR_CANCELED) {
- msleep(100);
- goto restart;
- }
- }
+ if (!(gh->gh_flags & GL_ASYNC))
+ error = gfs2_glock_wait(gh);
return error;
}
@@ -1196,48 +966,7 @@ restart:
int gfs2_glock_poll(struct gfs2_holder *gh)
{
- struct gfs2_glock *gl = gh->gh_gl;
- int ready = 0;
-
- spin_lock(&gl->gl_spin);
-
- if (test_bit(HIF_HOLDER, &gh->gh_iflags))
- ready = 1;
- else if (list_empty(&gh->gh_list)) {
- if (gh->gh_error == GLR_CANCELED) {
- spin_unlock(&gl->gl_spin);
- msleep(100);
- if (gfs2_glock_nq(gh))
- return 1;
- return 0;
- } else
- ready = 1;
- }
-
- spin_unlock(&gl->gl_spin);
-
- return ready;
-}
-
-/**
- * gfs2_glock_wait - wait for a lock acquisition that ended in a GLR_ASYNC
- * @gh: the holder structure
- *
- * Returns: 0, GLR_TRYFAILED, or errno on failure
- */
-
-int gfs2_glock_wait(struct gfs2_holder *gh)
-{
- int error;
-
- error = glock_wait_internal(gh);
- if (error == GLR_CANCELED) {
- msleep(100);
- gh->gh_flags &= ~GL_ASYNC;
- error = gfs2_glock_nq(gh);
- }
-
- return error;
+ return test_bit(HIF_WAIT, &gh->gh_iflags) ? 0 : 1;
}
/**
@@ -1251,26 +980,30 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
struct gfs2_glock *gl = gh->gh_gl;
const struct gfs2_glock_operations *glops = gl->gl_ops;
unsigned delay = 0;
+ int fast_path = 0;
+ spin_lock(&gl->gl_spin);
if (gh->gh_flags & GL_NOCACHE)
handle_callback(gl, LM_ST_UNLOCKED, 0, 0);
- gfs2_glmutex_lock(gl);
-
- spin_lock(&gl->gl_spin);
list_del_init(&gh->gh_list);
-
- if (list_empty(&gl->gl_holders)) {
+ if (find_first_holder(gl) == NULL) {
if (glops->go_unlock) {
+ GLOCK_BUG_ON(gl, test_and_set_bit(GLF_LOCK, &gl->gl_flags));
spin_unlock(&gl->gl_spin);
glops->go_unlock(gh);
spin_lock(&gl->gl_spin);
+ clear_bit(GLF_LOCK, &gl->gl_flags);
}
gl->gl_stamp = jiffies;
+ if (list_empty(&gl->gl_holders) &&
+ !test_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
+ !test_bit(GLF_DEMOTE, &gl->gl_flags))
+ fast_path = 1;
}
-
- clear_bit(GLF_LOCK, &gl->gl_flags);
spin_unlock(&gl->gl_spin);
+ if (likely(fast_path))
+ return;
gfs2_glock_hold(gl);
if (test_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
@@ -1469,20 +1202,14 @@ int gfs2_lvb_hold(struct gfs2_glock *gl)
{
int error;
- gfs2_glmutex_lock(gl);
-
if (!atomic_read(&gl->gl_lvb_count)) {
error = gfs2_lm_hold_lvb(gl->gl_sbd, gl->gl_lock, &gl->gl_lvb);
- if (error) {
- gfs2_glmutex_unlock(gl);
+ if (error)
return error;
- }
gfs2_glock_hold(gl);
}
atomic_inc(&gl->gl_lvb_count);
- gfs2_glmutex_unlock(gl);
-
return 0;
}
@@ -1497,8 +1224,6 @@ void gfs2_lvb_unhold(struct gfs2_glock *gl)
struct gfs2_sbd *sdp = gl->gl_sbd;
gfs2_glock_hold(gl);
- gfs2_glmutex_lock(gl);
-
gfs2_assert(gl->gl_sbd, atomic_read(&gl->gl_lvb_count) > 0);
if (atomic_dec_and_test(&gl->gl_lvb_count)) {
if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
@@ -1506,8 +1231,6 @@ void gfs2_lvb_unhold(struct gfs2_glock *gl)
gl->gl_lvb = NULL;
gfs2_glock_put(gl);
}
-
- gfs2_glmutex_unlock(gl);
gfs2_glock_put(gl);
}
@@ -1527,7 +1250,9 @@ static void blocking_cb(struct gfs2_sbd *sdp, struct lm_lockname *name,
if (time_before(now, holdtime))
delay = holdtime - now;
+ spin_lock(&gl->gl_spin);
handle_callback(gl, state, 1, delay);
+ spin_unlock(&gl->gl_spin);
if (queue_delayed_work(glock_workqueue, &gl->gl_work, delay) == 0)
gfs2_glock_put(gl);
}
@@ -1568,7 +1293,8 @@ void gfs2_glock_cb(void *cb_data, unsigned int type, void *data)
gl = gfs2_glock_find(sdp, &async->lc_name);
if (gfs2_assert_warn(sdp, gl))
return;
- xmote_bh(gl, async->lc_ret);
+ gl->gl_reply = async->lc_ret;
+ set_bit(GLF_REPLY_PENDING, &gl->gl_flags);
if (queue_delayed_work(glock_workqueue, &gl->gl_work, 0) == 0)
gfs2_glock_put(gl);
up_read(&gfs2_umount_flush_sem);
@@ -1646,6 +1372,7 @@ void gfs2_glock_schedule_for_reclaim(struct gfs2_glock *gl)
void gfs2_reclaim_glock(struct gfs2_sbd *sdp)
{
struct gfs2_glock *gl;
+ int done_callback = 0;
spin_lock(&sdp->sd_reclaim_lock);
if (list_empty(&sdp->sd_reclaim_list)) {
@@ -1660,14 +1387,16 @@ void gfs2_reclaim_glock(struct gfs2_sbd *sdp)
atomic_dec(&sdp->sd_reclaim_count);
atomic_inc(&sdp->sd_reclaimed);
- if (gfs2_glmutex_trylock(gl)) {
- if (list_empty(&gl->gl_holders) &&
- gl->gl_state != LM_ST_UNLOCKED && demote_ok(gl))
- handle_callback(gl, LM_ST_UNLOCKED, 0, 0);
- gfs2_glmutex_unlock(gl);
+ spin_lock(&gl->gl_spin);
+ if (find_first_holder(gl) == NULL &&
+ gl->gl_state != LM_ST_UNLOCKED && demote_ok(gl)) {
+ handle_callback(gl, LM_ST_UNLOCKED, 0, 0);
+ done_callback = 1;
}
-
- gfs2_glock_put(gl);
+ spin_unlock(&gl->gl_spin);
+ if (!done_callback ||
+ queue_delayed_work(glock_workqueue, &gl->gl_work, 0) == 0)
+ gfs2_glock_put(gl);
}
/**
@@ -1724,18 +1453,14 @@ static void scan_glock(struct gfs2_glock *gl)
{
if (gl->gl_ops == &gfs2_inode_glops && gl->gl_object)
return;
+ if (test_bit(GLF_LOCK, &gl->gl_flags))
+ return;
- if (gfs2_glmutex_trylock(gl)) {
- if (list_empty(&gl->gl_holders) &&
- gl->gl_state != LM_ST_UNLOCKED && demote_ok(gl))
- goto out_schedule;
- gfs2_glmutex_unlock(gl);
- }
- return;
-
-out_schedule:
- gfs2_glmutex_unlock(gl);
- gfs2_glock_schedule_for_reclaim(gl);
+ spin_lock(&gl->gl_spin);
+ if (find_first_holder(gl) == NULL &&
+ gl->gl_state != LM_ST_UNLOCKED && demote_ok(gl))
+ gfs2_glock_schedule_for_reclaim(gl);
+ spin_unlock(&gl->gl_spin);
}
/**
@@ -1760,12 +1485,13 @@ static void clear_glock(struct gfs2_glock *gl)
spin_unlock(&sdp->sd_reclaim_lock);
}
- if (gfs2_glmutex_trylock(gl)) {
- if (list_empty(&gl->gl_holders) &&
- gl->gl_state != LM_ST_UNLOCKED)
- handle_callback(gl, LM_ST_UNLOCKED, 0, 0);
- gfs2_glmutex_unlock(gl);
- }
+ spin_lock(&gl->gl_spin);
+ if (find_first_holder(gl) == NULL && gl->gl_state != LM_ST_UNLOCKED)
+ handle_callback(gl, LM_ST_UNLOCKED, 0, 0);
+ spin_unlock(&gl->gl_spin);
+ gfs2_glock_hold(gl);
+ if (queue_delayed_work(glock_workqueue, &gl->gl_work, 0) == 0)
+ gfs2_glock_put(gl);
}
/**
@@ -1810,180 +1536,164 @@ void gfs2_gl_hash_clear(struct gfs2_sbd *sdp, int wait)
}
}
-/*
- * Diagnostic routines to help debug distributed deadlock
- */
-
-static void gfs2_print_symbol(struct glock_iter *gi, const char *fmt,
- unsigned long address)
+static const char *state2str(unsigned state)
{
- char buffer[KSYM_SYMBOL_LEN];
-
- sprint_symbol(buffer, address);
- print_dbg(gi, fmt, buffer);
+ switch(state) {
+ case LM_ST_UNLOCKED:
+ return "UN";
+ case LM_ST_SHARED:
+ return "SH";
+ case LM_ST_DEFERRED:
+ return "DF";
+ case LM_ST_EXCLUSIVE:
+ return "EX";
+ }
+ return "??";
+}
+
+static const char *hflags2str(char *buf, unsigned flags, unsigned long iflags)
+{
+ char *p = buf;
+ if (flags & LM_FLAG_TRY)
+ *p++ = 't';
+ if (flags & LM_FLAG_TRY_1CB)
+ *p++ = 'T';
+ if (flags & LM_FLAG_NOEXP)
+ *p++ = 'e';
+ if (flags & LM_FLAG_ANY)
+ *p++ = 'a';
+ if (flags & LM_FLAG_PRIORITY)
+ *p++ = 'p';
+ if (flags & GL_ASYNC)
+ *p++ = 'a';
+ if (flags & GL_EXACT)
+ *p++ = 'E';
+ if (flags & GL_ATIME)
+ *p++ = 'a';
+ if (flags & GL_NOCACHE)
+ *p++ = 'c';
+ if (test_bit(HIF_HOLDER, &iflags))
+ *p++ = 'H';
+ if (test_bit(HIF_WAIT, &iflags))
+ *p++ = 'W';
+ if (test_bit(HIF_FIRST, &iflags))
+ *p++ = 'F';
+ *p = 0;
+ return buf;
}
/**
* dump_holder - print information about a glock holder
- * @str: a string naming the type of holder
+ * @seq: the seq_file struct
* @gh: the glock holder
*
* Returns: 0 on success, -ENOBUFS when we run out of space
*/
-static int dump_holder(struct glock_iter *gi, char *str,
- struct gfs2_holder *gh)
+static int dump_holder(struct seq_file *seq, const struct gfs2_holder *gh)
{
- unsigned int x;
- struct task_struct *gh_owner;
+ struct task_struct *gh_owner = NULL;
+ char buffer[KSYM_SYMBOL_LEN];
+ char flags_buf[32];
- print_dbg(gi, " %s\n", str);
- if (gh->gh_owner_pid) {
- print_dbg(gi, " owner = %ld ",
- (long)pid_nr(gh->gh_owner_pid));
+ sprint_symbol(buffer, gh->gh_ip);
+ if (gh->gh_owner_pid)
gh_owner = pid_task(gh->gh_owner_pid, PIDTYPE_PID);
- if (gh_owner)
- print_dbg(gi, "(%s)\n", gh_owner->comm);
- else
- print_dbg(gi, "(ended)\n");
- } else
- print_dbg(gi, " owner = -1\n");
- print_dbg(gi, " gh_state = %u\n", gh->gh_state);
- print_dbg(gi, " gh_flags =");
- for (x = 0; x < 32; x++)
- if (gh->gh_flags & (1 << x))
- print_dbg(gi, " %u", x);
- print_dbg(gi, " \n");
- print_dbg(gi, " error = %d\n", gh->gh_error);
- print_dbg(gi, " gh_iflags =");
- for (x = 0; x < 32; x++)
- if (test_bit(x, &gh->gh_iflags))
- print_dbg(gi, " %u", x);
- print_dbg(gi, " \n");
- gfs2_print_symbol(gi, " initialized at: %s\n", gh->gh_ip);
-
+ gfs2_print_dbg(seq, " H: s:%s f:%s e:%d p:%ld [%s] %s\n",
+ state2str(gh->gh_state),
+ hflags2str(flags_buf, gh->gh_flags, gh->gh_iflags),
+ gh->gh_error,
+ gh->gh_owner_pid ? (long)pid_nr(gh->gh_owner_pid) : -1,
+ gh_owner ? gh_owner->comm : "(ended)", buffer);
return 0;
}
-/**
- * dump_inode - print information about an inode
- * @ip: the inode
- *
- * Returns: 0 on success, -ENOBUFS when we run out of space
- */
-
-static int dump_inode(struct glock_iter *gi, struct gfs2_inode *ip)
-{
- unsigned int x;
-
- print_dbg(gi, " Inode:\n");
- print_dbg(gi, " num = %llu/%llu\n",
- (unsigned long long)ip->i_no_formal_ino,
- (unsigned long long)ip->i_no_addr);
- print_dbg(gi, " type = %u\n", IF2DT(ip->i_inode.i_mode));
- print_dbg(gi, " i_flags =");
- for (x = 0; x < 32; x++)
- if (test_bit(x, &ip->i_flags))
- print_dbg(gi, " %u", x);
- print_dbg(gi, " \n");
- return 0;
+static const char *gflags2str(char *buf, const unsigned long *gflags)
+{
+ char *p = buf;
+ if (test_bit(GLF_LOCK, gflags))
+ *p++ = 'l';
+ if (test_bit(GLF_STICKY, gflags))
+ *p++ = 's';
+ if (test_bit(GLF_DEMOTE, gflags))
+ *p++ = 'D';
+ if (test_bit(GLF_PENDING_DEMOTE, gflags))
+ *p++ = 'd';
+ if (test_bit(GLF_DEMOTE_IN_PROGRESS, gflags))
+ *p++ = 'p';
+ if (test_bit(GLF_DIRTY, gflags))
+ *p++ = 'y';
+ if (test_bit(GLF_LFLUSH, gflags))
+ *p++ = 'f';
+ if (test_bit(GLF_INVALIDATE_IN_PROGRESS, gflags))
+ *p++ = 'i';
+ if (test_bit(GLF_REPLY_PENDING, gflags))
+ *p++ = 'r';
+ *p = 0;
+ return buf;
}
/**
- * dump_glock - print information about a glock
+ * __dump_glock - print information about a glock
+ * @seq: The seq_file struct
* @gl: the glock
- * @count: where we are in the buffer
+ *
+ * The file format is as follows:
+ * One line per object, capital letters are used to indicate objects
+ * G = glock, I = Inode, R = rgrp, H = holder. Glocks are not indented,
+ * other objects are indented by a single space and follow the glock to
+ * which they are related. Fields are indicated by lower case letters
+ * followed by a colon and the field value, except for strings which are in
+ * [] so that its possible to see if they are composed of spaces for
+ * example. The field's are n = number (id of the object), f = flags,
+ * t = type, s = state, r = refcount, e = error, p = pid.
*
* Returns: 0 on success, -ENOBUFS when we run out of space
*/
-static int dump_glock(struct glock_iter *gi, struct gfs2_glock *gl)
+static int __dump_glock(struct seq_file *seq, const struct gfs2_glock *gl)
{
- struct gfs2_holder *gh;
- unsigned int x;
- int error = -ENOBUFS;
- struct task_struct *gl_owner;
+ const struct gfs2_glock_operations *glops = gl->gl_ops;
+ unsigned long long dtime;
+ const struct gfs2_holder *gh;
+ char gflags_buf[32];
+ int error = 0;
- spin_lock(&gl->gl_spin);
+ dtime = jiffies - gl->gl_demote_time;
+ dtime *= 1000000/HZ; /* demote time in uSec */
+ if (!test_bit(GLF_DEMOTE, &gl->gl_flags))
+ dtime = 0;
+ gfs2_print_dbg(seq, "G: s:%s n:%u/%llu f:%s t:%s d:%s/%llu l:%d a:%d r:%d\n",
+ state2str(gl->gl_state),
+ gl->gl_name.ln_type,
+ (unsigned long long)gl->gl_name.ln_number,
+ gflags2str(gflags_buf, &gl->gl_flags),
+ state2str(gl->gl_target),
+ state2str(gl->gl_demote_state), dtime,
+ atomic_read(&gl->gl_lvb_count),
+ atomic_read(&gl->gl_ail_count),
+ atomic_read(&gl->gl_ref));
- print_dbg(gi, "Glock 0x%p (%u, 0x%llx)\n", gl, gl->gl_name.ln_type,
- (unsigned long long)gl->gl_name.ln_number);
- print_dbg(gi, " gl_flags =");
- for (x = 0; x < 32; x++) {
- if (test_bit(x, &gl->gl_flags))
- print_dbg(gi, " %u", x);
- }
- if (!test_bit(GLF_LOCK, &gl->gl_flags))
- print_dbg(gi, " (unlocked)");
- print_dbg(gi, " \n");
- print_dbg(gi, " gl_ref = %d\n", atomic_read(&gl->gl_ref));
- print_dbg(gi, " gl_state = %u\n", gl->gl_state);
- if (gl->gl_owner_pid) {
- gl_owner = pid_task(gl->gl_owner_pid, PIDTYPE_PID);
- if (gl_owner)
- print_dbg(gi, " gl_owner = pid %d (%s)\n",
- pid_nr(gl->gl_owner_pid), gl_owner->comm);
- else
- print_dbg(gi, " gl_owner = %d (ended)\n",
- pid_nr(gl->gl_owner_pid));
- } else
- print_dbg(gi, " gl_owner = -1\n");
- print_dbg(gi, " gl_ip = %lu\n", gl->gl_ip);
- print_dbg(gi, " req_gh = %s\n", (gl->gl_req_gh) ? "yes" : "no");
- print_dbg(gi, " lvb_count = %d\n", atomic_read(&gl->gl_lvb_count));
- print_dbg(gi, " object = %s\n", (gl->gl_object) ? "yes" : "no");
- print_dbg(gi, " reclaim = %s\n",
- (list_empty(&gl->gl_reclaim)) ? "no" : "yes");
- if (gl->gl_aspace)
- print_dbg(gi, " aspace = 0x%p nrpages = %lu\n", gl->gl_aspace,
- gl->gl_aspace->i_mapping->nrpages);
- else
- print_dbg(gi, " aspace = no\n");
- print_dbg(gi, " ail = %d\n", atomic_read(&gl->gl_ail_count));
- if (gl->gl_req_gh) {
- error = dump_holder(gi, "Request", gl->gl_req_gh);
- if (error)
- goto out;
- }
list_for_each_entry(gh, &gl->gl_holders, gh_list) {
- error = dump_holder(gi, "Holder", gh);
- if (error)
- goto out;
- }
- list_for_each_entry(gh, &gl->gl_waiters1, gh_list) {
- error = dump_holder(gi, "Waiter1", gh);
- if (error)
- goto out;
- }
- list_for_each_entry(gh, &gl->gl_waiters3, gh_list) {
- error = dump_holder(gi, "Waiter3", gh);
+ error = dump_holder(seq, gh);
if (error)
goto out;
}
- if (test_bit(GLF_DEMOTE, &gl->gl_flags)) {
- print_dbg(gi, " Demotion req to state %u (%llu uS ago)\n",
- gl->gl_demote_state, (unsigned long long)
- (jiffies - gl->gl_demote_time)*(1000000/HZ));
- }
- if (gl->gl_ops == &gfs2_inode_glops && gl->gl_object) {
- if (!test_bit(GLF_LOCK, &gl->gl_flags) &&
- list_empty(&gl->gl_holders)) {
- error = dump_inode(gi, gl->gl_object);
- if (error)
- goto out;
- } else {
- error = -ENOBUFS;
- print_dbg(gi, " Inode: busy\n");
- }
- }
-
- error = 0;
-
+ if (gl->gl_state != LM_ST_UNLOCKED && glops->go_dump)
+ error = glops->go_dump(seq, gl);
out:
- spin_unlock(&gl->gl_spin);
return error;
}
+static int dump_glock(struct seq_file *seq, struct gfs2_glock *gl)
+{
+ int ret;
+ spin_lock(&gl->gl_spin);
+ ret = __dump_glock(seq, gl);
+ spin_unlock(&gl->gl_spin);
+ return ret;
+}
+
/**
* gfs2_dump_lockstate - print out the current lockstate
* @sdp: the filesystem
@@ -2086,7 +1796,7 @@ void gfs2_glock_exit(void)
module_param(scand_secs, uint, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(scand_secs, "The number of seconds between scand runs");
-static int gfs2_glock_iter_next(struct glock_iter *gi)
+static int gfs2_glock_iter_next(struct gfs2_glock_iter *gi)
{
struct gfs2_glock *gl;
@@ -2104,7 +1814,7 @@ restart:
gfs2_glock_put(gl);
if (gl && gi->gl == NULL)
gi->hash++;
- while(gi->gl == NULL) {
+ while (gi->gl == NULL) {
if (gi->hash >= GFS2_GL_HASH_SIZE)
return 1;
read_lock(gl_lock_addr(gi->hash));
@@ -2122,58 +1832,34 @@ restart:
return 0;
}
-static void gfs2_glock_iter_free(struct glock_iter *gi)
+static void gfs2_glock_iter_free(struct gfs2_glock_iter *gi)
{
if (gi->gl)
gfs2_glock_put(gi->gl);
- kfree(gi);
-}
-
-static struct glock_iter *gfs2_glock_iter_init(struct gfs2_sbd *sdp)
-{
- struct glock_iter *gi;
-
- gi = kmalloc(sizeof (*gi), GFP_KERNEL);
- if (!gi)
- return NULL;
-
- gi->sdp = sdp;
- gi->hash = 0;
- gi->seq = NULL;
gi->gl = NULL;
- memset(gi->string, 0, sizeof(gi->string));
-
- if (gfs2_glock_iter_next(gi)) {
- gfs2_glock_iter_free(gi);
- return NULL;
- }
-
- return gi;
}
-static void *gfs2_glock_seq_start(struct seq_file *file, loff_t *pos)
+static void *gfs2_glock_seq_start(struct seq_file *seq, loff_t *pos)
{
- struct glock_iter *gi;
+ struct gfs2_glock_iter *gi = seq->private;
loff_t n = *pos;
- gi = gfs2_glock_iter_init(file->private);
- if (!gi)
- return NULL;
+ gi->hash = 0;
- while(n--) {
+ do {
if (gfs2_glock_iter_next(gi)) {
gfs2_glock_iter_free(gi);
return NULL;
}
- }
+ } while (n--);
- return gi;
+ return gi->gl;
}
-static void *gfs2_glock_seq_next(struct seq_file *file, void *iter_ptr,
+static void *gfs2_glock_seq_next(struct seq_file *seq, void *iter_ptr,
loff_t *pos)
{
- struct glock_iter *gi = iter_ptr;
+ struct gfs2_glock_iter *gi = seq->private;
(*pos)++;
@@ -2182,24 +1868,18 @@ static void *gfs2_glock_seq_next(struct seq_file *file, void *iter_ptr,
return NULL;
}
- return gi;
+ return gi->gl;
}
-static void gfs2_glock_seq_stop(struct seq_file *file, void *iter_ptr)
+static void gfs2_glock_seq_stop(struct seq_file *seq, void *iter_ptr)
{
- struct glock_iter *gi = iter_ptr;
- if (gi)
- gfs2_glock_iter_free(gi);
+ struct gfs2_glock_iter *gi = seq->private;
+ gfs2_glock_iter_free(gi);
}
-static int gfs2_glock_seq_show(struct seq_file *file, void *iter_ptr)
+static int gfs2_glock_seq_show(struct seq_file *seq, void *iter_ptr)
{
- struct glock_iter *gi = iter_ptr;
-
- gi->seq = file;
- dump_glock(gi, gi->gl);
-
- return 0;
+ return dump_glock(seq, iter_ptr);
}
static const struct seq_operations gfs2_glock_seq_ops = {
@@ -2211,17 +1891,14 @@ static const struct seq_operations gfs2_glock_seq_ops = {
static int gfs2_debugfs_open(struct inode *inode, struct file *file)
{
- struct seq_file *seq;
- int ret;
-
- ret = seq_open(file, &gfs2_glock_seq_ops);
- if (ret)
- return ret;
-
- seq = file->private_data;
- seq->private = inode->i_private;
-
- return 0;
+ int ret = seq_open_private(file, &gfs2_glock_seq_ops,
+ sizeof(struct gfs2_glock_iter));
+ if (ret == 0) {
+ struct seq_file *seq = file->private_data;
+ struct gfs2_glock_iter *gi = seq->private;
+ gi->sdp = inode->i_private;
+ }
+ return ret;
}
static const struct file_operations gfs2_debug_fops = {
@@ -2229,7 +1906,7 @@ static const struct file_operations gfs2_debug_fops = {
.open = gfs2_debugfs_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = seq_release
+ .release = seq_release_private,
};
int gfs2_create_debugfs_file(struct gfs2_sbd *sdp)
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index cdad3e6..7389f8e 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -26,11 +26,8 @@
#define GL_SKIP 0x00000100
#define GL_ATIME 0x00000200
#define GL_NOCACHE 0x00000400
-#define GL_FLOCK 0x00000800
-#define GL_NOCANCEL 0x00001000
#define GLR_TRYFAILED 13
-#define GLR_CANCELED 14
static inline struct gfs2_holder *gfs2_glock_is_locked_by_me(struct gfs2_glock *gl)
{
@@ -41,6 +38,8 @@ static inline struct gfs2_holder *gfs2_glock_is_locked_by_me(struct gfs2_glock *
spin_lock(&gl->gl_spin);
pid = task_pid(current);
list_for_each_entry(gh, &gl->gl_holders, gh_list) {
+ if (!test_bit(HIF_HOLDER, &gh->gh_iflags))
+ break;
if (gh->gh_owner_pid == pid)
goto out;
}
@@ -70,7 +69,7 @@ static inline int gfs2_glock_is_blocking(struct gfs2_glock *gl)
{
int ret;
spin_lock(&gl->gl_spin);
- ret = test_bit(GLF_DEMOTE, &gl->gl_flags) || !list_empty(&gl->gl_waiters3);
+ ret = test_bit(GLF_DEMOTE, &gl->gl_flags);
spin_unlock(&gl->gl_spin);
return ret;
}
@@ -98,6 +97,7 @@ int gfs2_glock_nq_num(struct gfs2_sbd *sdp,
int gfs2_glock_nq_m(unsigned int num_gh, struct gfs2_holder *ghs);
void gfs2_glock_dq_m(unsigned int num_gh, struct gfs2_holder *ghs);
void gfs2_glock_dq_uninit_m(unsigned int num_gh, struct gfs2_holder *ghs);
+void gfs2_print_dbg(struct seq_file *seq, const char *fmt, ...);
/**
* gfs2_glock_nq_init - intialize a holder and enqueue it on a glock
@@ -130,7 +130,6 @@ int gfs2_lvb_hold(struct gfs2_glock *gl);
void gfs2_lvb_unhold(struct gfs2_glock *gl);
void gfs2_glock_cb(void *cb_data, unsigned int type, void *data);
-
void gfs2_glock_schedule_for_reclaim(struct gfs2_glock *gl);
void gfs2_reclaim_glock(struct gfs2_sbd *sdp);
void gfs2_gl_hash_clear(struct gfs2_sbd *sdp, int wait);
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 07d84d1..c6c318c 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -13,6 +13,7 @@
#include <linux/buffer_head.h>
#include <linux/gfs2_ondisk.h>
#include <linux/lm_interface.h>
+#include <linux/bio.h>
#include "gfs2.h"
#include "incore.h"
@@ -172,26 +173,6 @@ static void inode_go_sync(struct gfs2_glock *gl)
}
/**
- * inode_go_xmote_bh - After promoting/demoting a glock
- * @gl: the glock
- *
- */
-
-static void inode_go_xmote_bh(struct gfs2_glock *gl)
-{
- struct gfs2_holder *gh = gl->gl_req_gh;
- struct buffer_head *bh;
- int error;
-
- if (gl->gl_state != LM_ST_UNLOCKED &&
- (!gh || !(gh->gh_flags & GL_SKIP))) {
- error = gfs2_meta_read(gl, gl->gl_name.ln_number, 0, &bh);
- if (!error)
- brelse(bh);
- }
-}
-
-/**
* inode_go_inval - prepare a inode glock to be released
* @gl: the glock
* @flags:
@@ -267,6 +248,26 @@ static int inode_go_lock(struct gfs2_holder *gh)
}
/**
+ * inode_go_dump - print information about an inode
+ * @seq: The iterator
+ * @ip: the inode
+ *
+ * Returns: 0 on success, -ENOBUFS when we run out of space
+ */
+
+static int inode_go_dump(struct seq_file *seq, const struct gfs2_glock *gl)
+{
+ const struct gfs2_inode *ip = gl->gl_object;
+ if (ip == NULL)
+ return 0;
+ gfs2_print_dbg(seq, " I: n:%llu/%llu t:%u f:0x%08lx\n",
+ (unsigned long long)ip->i_no_formal_ino,
+ (unsigned long long)ip->i_no_addr,
+ IF2DT(ip->i_inode.i_mode), ip->i_flags);
+ return 0;
+}
+
+/**
* rgrp_go_demote_ok - Check to see if it's ok to unlock a RG's glock
* @gl: the glock
*
@@ -306,6 +307,22 @@ static void rgrp_go_unlock(struct gfs2_holder *gh)
}
/**
+ * rgrp_go_dump - print out an rgrp
+ * @seq: The iterator
+ * @gl: The glock in question
+ *
+ */
+
+static int rgrp_go_dump(struct seq_file *seq, const struct gfs2_glock *gl)
+{
+ const struct gfs2_rgrpd *rgd = gl->gl_object;
+ if (rgd == NULL)
+ return 0;
+ gfs2_print_dbg(seq, " R: n:%llu\n", (unsigned long long)rgd->rd_addr);
+ return 0;
+}
+
+/**
* trans_go_sync - promote/demote the transaction glock
* @gl: the glock
* @state: the requested state
@@ -330,7 +347,7 @@ static void trans_go_sync(struct gfs2_glock *gl)
*
*/
-static void trans_go_xmote_bh(struct gfs2_glock *gl)
+static int trans_go_xmote_bh(struct gfs2_glock *gl, struct gfs2_holder *gh)
{
struct gfs2_sbd *sdp = gl->gl_sbd;
struct gfs2_inode *ip = GFS2_I(sdp->sd_jdesc->jd_inode);
@@ -338,8 +355,7 @@ static void trans_go_xmote_bh(struct gfs2_glock *gl)
struct gfs2_log_header_host head;
int error;
- if (gl->gl_state != LM_ST_UNLOCKED &&
- test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) {
+ if (test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags)) {
j_gl->gl_ops->go_inval(j_gl, DIO_METADATA);
error = gfs2_find_jhead(sdp->sd_jdesc, &head);
@@ -354,6 +370,7 @@ static void trans_go_xmote_bh(struct gfs2_glock *gl)
gfs2_log_pointers_init(sdp, head.lh_blkno);
}
}
+ return 0;
}
/**
@@ -375,12 +392,12 @@ const struct gfs2_glock_operations gfs2_meta_glops = {
const struct gfs2_glock_operations gfs2_inode_glops = {
.go_xmote_th = inode_go_sync,
- .go_xmote_bh = inode_go_xmote_bh,
.go_inval = inode_go_inval,
.go_demote_ok = inode_go_demote_ok,
.go_lock = inode_go_lock,
+ .go_dump = inode_go_dump,
.go_type = LM_TYPE_INODE,
- .go_min_hold_time = HZ / 10,
+ .go_min_hold_time = HZ / 5,
};
const struct gfs2_glock_operations gfs2_rgrp_glops = {
@@ -389,8 +406,9 @@ const struct gfs2_glock_operations gfs2_rgrp_glops = {
.go_demote_ok = rgrp_go_demote_ok,
.go_lock = rgrp_go_lock,
.go_unlock = rgrp_go_unlock,
+ .go_dump = rgrp_go_dump,
.go_type = LM_TYPE_RGRP,
- .go_min_hold_time = HZ / 10,
+ .go_min_hold_time = HZ / 5,
};
const struct gfs2_glock_operations gfs2_trans_glops = {
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index eabe5ea..4b734c6 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -128,20 +128,20 @@ struct gfs2_bufdata {
struct gfs2_glock_operations {
void (*go_xmote_th) (struct gfs2_glock *gl);
- void (*go_xmote_bh) (struct gfs2_glock *gl);
+ int (*go_xmote_bh) (struct gfs2_glock *gl, struct gfs2_holder *gh);
void (*go_inval) (struct gfs2_glock *gl, int flags);
int (*go_demote_ok) (struct gfs2_glock *gl);
int (*go_lock) (struct gfs2_holder *gh);
void (*go_unlock) (struct gfs2_holder *gh);
+ int (*go_dump)(struct seq_file *seq, const struct gfs2_glock *gl);
const int go_type;
const unsigned long go_min_hold_time;
};
enum {
/* States */
- HIF_HOLDER = 6,
+ HIF_HOLDER = 6, /* Set for gh that "holds" the glock */
HIF_FIRST = 7,
- HIF_ABORTED = 9,
HIF_WAIT = 10,
};
@@ -154,20 +154,20 @@ struct gfs2_holder {
unsigned gh_flags;
int gh_error;
- unsigned long gh_iflags;
+ unsigned long gh_iflags; /* HIF_... */
unsigned long gh_ip;
};
enum {
- GLF_LOCK = 1,
- GLF_STICKY = 2,
- GLF_DEMOTE = 3,
- GLF_PENDING_DEMOTE = 4,
- GLF_DIRTY = 5,
- GLF_DEMOTE_IN_PROGRESS = 6,
- GLF_LFLUSH = 7,
- GLF_WAITERS2 = 8,
- GLF_CONV_DEADLK = 9,
+ GLF_LOCK = 1,
+ GLF_STICKY = 2,
+ GLF_DEMOTE = 3,
+ GLF_PENDING_DEMOTE = 4,
+ GLF_DEMOTE_IN_PROGRESS = 5,
+ GLF_DIRTY = 6,
+ GLF_LFLUSH = 7,
+ GLF_INVALIDATE_IN_PROGRESS = 8,
+ GLF_REPLY_PENDING = 9,
};
struct gfs2_glock {
@@ -179,19 +179,14 @@ struct gfs2_glock {
spinlock_t gl_spin;
unsigned int gl_state;
+ unsigned int gl_target;
+ unsigned int gl_reply;
unsigned int gl_hash;
unsigned int gl_demote_state; /* state requested by remote node */
unsigned long gl_demote_time; /* time of first demote request */
- struct pid *gl_owner_pid;
- unsigned long gl_ip;
struct list_head gl_holders;
- struct list_head gl_waiters1; /* HIF_MUTEX */
- struct list_head gl_waiters3; /* HIF_PROMOTE */
const struct gfs2_glock_operations *gl_ops;
-
- struct gfs2_holder *gl_req_gh;
-
void *gl_lock;
char *gl_lvb;
atomic_t gl_lvb_count;
diff --git a/fs/gfs2/locking/dlm/lock.c b/fs/gfs2/locking/dlm/lock.c
index cf7ea8a..fed9a67 100644
--- a/fs/gfs2/locking/dlm/lock.c
+++ b/fs/gfs2/locking/dlm/lock.c
@@ -308,6 +308,9 @@ unsigned int gdlm_lock(void *lock, unsigned int cur_state,
{
struct gdlm_lock *lp = lock;
+ if (req_state == LM_ST_UNLOCKED)
+ return gdlm_unlock(lock, cur_state);
+
clear_bit(LFL_DLM_CANCEL, &lp->flags);
if (flags & LM_FLAG_NOEXP)
set_bit(LFL_NOBLOCK, &lp->flags);
diff --git a/fs/gfs2/locking/nolock/main.c b/fs/gfs2/locking/nolock/main.c
index 284a5ec..627bfb7 100644
--- a/fs/gfs2/locking/nolock/main.c
+++ b/fs/gfs2/locking/nolock/main.c
@@ -107,6 +107,8 @@ static void nolock_put_lock(void *lock)
static unsigned int nolock_lock(void *lock, unsigned int cur_state,
unsigned int req_state, unsigned int flags)
{
+ if (req_state == LM_ST_UNLOCKED)
+ return 0;
return req_state | LM_OUT_CACHEABLE;
}
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 053e2eb..bcc668d 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -40,8 +40,6 @@ static void gfs2_init_glock_once(struct kmem_cache *cachep, void *foo)
INIT_HLIST_NODE(&gl->gl_list);
spin_lock_init(&gl->gl_spin);
INIT_LIST_HEAD(&gl->gl_holders);
- INIT_LIST_HEAD(&gl->gl_waiters1);
- INIT_LIST_HEAD(&gl->gl_waiters3);
gl->gl_lvb = NULL;
atomic_set(&gl->gl_lvb_count, 0);
INIT_LIST_HEAD(&gl->gl_reclaim);
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 78d75f8..0985362 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -129,7 +129,7 @@ void gfs2_meta_sync(struct gfs2_glock *gl)
}
/**
- * getbuf - Get a buffer with a given address space
+ * gfs2_getbuf - Get a buffer with a given address space
* @gl: the glock
* @blkno: the block number (filesystem scope)
* @create: 1 if the buffer should be created
@@ -137,7 +137,7 @@ void gfs2_meta_sync(struct gfs2_glock *gl)
* Returns: the buffer
*/
-static struct buffer_head *getbuf(struct gfs2_glock *gl, u64 blkno, int create)
+struct buffer_head *gfs2_getbuf(struct gfs2_glock *gl, u64 blkno, int create)
{
struct address_space *mapping = gl->gl_aspace->i_mapping;
struct gfs2_sbd *sdp = gl->gl_sbd;
@@ -205,7 +205,7 @@ static void meta_prep_new(struct buffer_head *bh)
struct buffer_head *gfs2_meta_new(struct gfs2_glock *gl, u64 blkno)
{
struct buffer_head *bh;
- bh = getbuf(gl, blkno, CREATE);
+ bh = gfs2_getbuf(gl, blkno, CREATE);
meta_prep_new(bh);
return bh;
}
@@ -223,7 +223,7 @@ struct buffer_head *gfs2_meta_new(struct gfs2_glock *gl, u64 blkno)
int gfs2_meta_read(struct gfs2_glock *gl, u64 blkno, int flags,
struct buffer_head **bhp)
{
- *bhp = getbuf(gl, blkno, CREATE);
+ *bhp = gfs2_getbuf(gl, blkno, CREATE);
if (!buffer_uptodate(*bhp)) {
ll_rw_block(READ_META, 1, bhp);
if (flags & DIO_WAIT) {
@@ -346,7 +346,7 @@ void gfs2_meta_wipe(struct gfs2_inode *ip, u64 bstart, u32 blen)
struct buffer_head *bh;
while (blen) {
- bh = getbuf(ip->i_gl, bstart, NO_CREATE);
+ bh = gfs2_getbuf(ip->i_gl, bstart, NO_CREATE);
if (bh) {
lock_buffer(bh);
gfs2_log_lock(sdp);
@@ -421,7 +421,7 @@ struct buffer_head *gfs2_meta_ra(struct gfs2_glock *gl, u64 dblock, u32 extlen)
if (extlen > max_ra)
extlen = max_ra;
- first_bh = getbuf(gl, dblock, CREATE);
+ first_bh = gfs2_getbuf(gl, dblock, CREATE);
if (buffer_uptodate(first_bh))
goto out;
@@ -432,7 +432,7 @@ struct buffer_head *gfs2_meta_ra(struct gfs2_glock *gl, u64 dblock, u32 extlen)
extlen--;
while (extlen) {
- bh = getbuf(gl, dblock, CREATE);
+ bh = gfs2_getbuf(gl, dblock, CREATE);
if (!buffer_uptodate(bh) && !buffer_locked(bh))
ll_rw_block(READA, 1, &bh);
diff --git a/fs/gfs2/meta_io.h b/fs/gfs2/meta_io.h
index 73e3b1c..b1a5f36 100644
--- a/fs/gfs2/meta_io.h
+++ b/fs/gfs2/meta_io.h
@@ -47,6 +47,7 @@ struct buffer_head *gfs2_meta_new(struct gfs2_glock *gl, u64 blkno);
int gfs2_meta_read(struct gfs2_glock *gl, u64 blkno,
int flags, struct buffer_head **bhp);
int gfs2_meta_wait(struct gfs2_sbd *sdp, struct buffer_head *bh);
+struct buffer_head *gfs2_getbuf(struct gfs2_glock *gl, u64 blkno, int create);
void gfs2_attach_bufdata(struct gfs2_glock *gl, struct buffer_head *bh,
int meta);
diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index f55394e..2b556dd 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -507,26 +507,23 @@ static int __gfs2_readpage(void *file, struct page *page)
static int gfs2_readpage(struct file *file, struct page *page)
{
struct gfs2_inode *ip = GFS2_I(page->mapping->host);
- struct gfs2_holder *gh;
+ struct gfs2_holder gh;
int error;
- gh = gfs2_glock_is_locked_by_me(ip->i_gl);
- if (!gh) {
- gh = kmalloc(sizeof(struct gfs2_holder), GFP_NOFS);
- if (!gh)
- return -ENOBUFS;
- gfs2_holder_init(ip->i_gl, LM_ST_SHARED, GL_ATIME, gh);
+ gfs2_holder_init(ip->i_gl, LM_ST_SHARED, GL_ATIME|LM_FLAG_TRY_1CB, &gh);
+ error = gfs2_glock_nq_atime(&gh);
+ if (unlikely(error)) {
unlock_page(page);
- error = gfs2_glock_nq_atime(gh);
- if (likely(error != 0))
- goto out;
- return AOP_TRUNCATED_PAGE;
+ goto out;
}
error = __gfs2_readpage(file, page);
- gfs2_glock_dq(gh);
+ gfs2_glock_dq(&gh);
out:
- gfs2_holder_uninit(gh);
- kfree(gh);
+ gfs2_holder_uninit(&gh);
+ if (error == GLR_TRYFAILED) {
+ yield();
+ return AOP_TRUNCATED_PAGE;
+ }
return error;
}
diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
index e1b7d52..0ff512a 100644
--- a/fs/gfs2/ops_file.c
+++ b/fs/gfs2/ops_file.c
@@ -669,8 +669,7 @@ static int do_flock(struct file *file, int cmd, struct file_lock *fl)
int error = 0;
state = (fl->fl_type == F_WRLCK) ? LM_ST_EXCLUSIVE : LM_ST_SHARED;
- flags = (IS_SETLKW(cmd) ? 0 : LM_FLAG_TRY) | GL_EXACT | GL_NOCACHE
- | GL_FLOCK;
+ flags = (IS_SETLKW(cmd) ? 0 : LM_FLAG_TRY) | GL_EXACT | GL_NOCACHE;
mutex_lock(&fp->f_fl_mutex);
@@ -683,9 +682,8 @@ static int do_flock(struct file *file, int cmd, struct file_lock *fl)
gfs2_glock_dq_wait(fl_gh);
gfs2_holder_reinit(state, flags, fl_gh);
} else {
- error = gfs2_glock_get(GFS2_SB(&ip->i_inode),
- ip->i_no_addr, &gfs2_flock_glops,
- CREATE, &gl);
+ error = gfs2_glock_get(GFS2_SB(&ip->i_inode), ip->i_no_addr,
+ &gfs2_flock_glops, CREATE, &gl);
if (error)
goto out;
gfs2_holder_init(gl, state, flags, fl_gh);
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 2888e4b..fdd3f0f 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -505,7 +505,7 @@ int gfs2_recover_journal(struct gfs2_jdesc *jd)
error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_SHARED,
LM_FLAG_NOEXP | LM_FLAG_PRIORITY |
- GL_NOCANCEL | GL_NOCACHE, &t_gh);
+ GL_NOCACHE, &t_gh);
if (error)
goto fail_gunlock_ji;
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 7aeacbc..12fe38f 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -941,8 +941,7 @@ static int gfs2_lock_fs_check_clean(struct gfs2_sbd *sdp,
}
error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_DEFERRED,
- LM_FLAG_PRIORITY | GL_NOCACHE,
- t_gh);
+ GL_NOCACHE, t_gh);
list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
error = gfs2_jdesc_check(jd);
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 02/18] [GFS2] Fix ordering bug in lock_dlm
2008-07-11 10:11 ` [Cluster-devel] [PATCH 01/18] [GFS2] Clean up the glock core swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 03/18] [GFS2] No lock_nolock swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
This looks like a lot of change, but in fact its not. Mostly its
things moving from one file to another. The change is just that
instead of queuing lock completions and callbacks from the DLM
we now pass them directly to GFS2.
This gives us a net loss of two list heads per glock (a fair
saving in memory) plus a reduction in the latency of delivering
the messages to GFS2, plus we now have one thread fewer as well.
There was a bug where callbacks and completions could be delivered
in the wrong order due to this unnecessary queuing which is fixed
by this patch.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
diff --git a/fs/gfs2/locking/dlm/lock.c b/fs/gfs2/locking/dlm/lock.c
index fed9a67..871ffc9 100644
--- a/fs/gfs2/locking/dlm/lock.c
+++ b/fs/gfs2/locking/dlm/lock.c
@@ -11,46 +11,63 @@
static char junk_lvb[GDLM_LVB_SIZE];
-static void queue_complete(struct gdlm_lock *lp)
+
+/* convert dlm lock-mode to gfs lock-state */
+
+static s16 gdlm_make_lmstate(s16 dlmmode)
{
- struct gdlm_ls *ls = lp->ls;
+ switch (dlmmode) {
+ case DLM_LOCK_IV:
+ case DLM_LOCK_NL:
+ return LM_ST_UNLOCKED;
+ case DLM_LOCK_EX:
+ return LM_ST_EXCLUSIVE;
+ case DLM_LOCK_CW:
+ return LM_ST_DEFERRED;
+ case DLM_LOCK_PR:
+ return LM_ST_SHARED;
+ }
+ gdlm_assert(0, "unknown DLM mode %d", dlmmode);
+ return -1;
+}
- clear_bit(LFL_ACTIVE, &lp->flags);
+/* A lock placed on this queue is re-submitted to DLM as soon as the lock_dlm
+ thread gets to it. */
+
+static void queue_submit(struct gdlm_lock *lp)
+{
+ struct gdlm_ls *ls = lp->ls;
spin_lock(&ls->async_lock);
- list_add_tail(&lp->clist, &ls->complete);
+ list_add_tail(&lp->delay_list, &ls->submit);
spin_unlock(&ls->async_lock);
wake_up(&ls->thread_wait);
}
-static inline void gdlm_ast(void *astarg)
+static void wake_up_ast(struct gdlm_lock *lp)
{
- queue_complete(astarg);
+ clear_bit(LFL_AST_WAIT, &lp->flags);
+ smp_mb__after_clear_bit();
+ wake_up_bit(&lp->flags, LFL_AST_WAIT);
}
-static inline void gdlm_bast(void *astarg, int mode)
+static void gdlm_delete_lp(struct gdlm_lock *lp)
{
- struct gdlm_lock *lp = astarg;
struct gdlm_ls *ls = lp->ls;
- if (!mode) {
- printk(KERN_INFO "lock_dlm: bast mode zero %x,%llx\n",
- lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number);
- return;
- }
-
spin_lock(&ls->async_lock);
- if (!lp->bast_mode) {
- list_add_tail(&lp->blist, &ls->blocking);
- lp->bast_mode = mode;
- } else if (lp->bast_mode < mode)
- lp->bast_mode = mode;
+ if (!list_empty(&lp->delay_list))
+ list_del_init(&lp->delay_list);
+ gdlm_assert(!list_empty(&lp->all_list), "%x,%llx", lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number);
+ list_del_init(&lp->all_list);
+ ls->all_locks_count--;
spin_unlock(&ls->async_lock);
- wake_up(&ls->thread_wait);
+
+ kfree(lp);
}
-void gdlm_queue_delayed(struct gdlm_lock *lp)
+static void gdlm_queue_delayed(struct gdlm_lock *lp)
{
struct gdlm_ls *ls = lp->ls;
@@ -59,6 +76,249 @@ void gdlm_queue_delayed(struct gdlm_lock *lp)
spin_unlock(&ls->async_lock);
}
+static void process_complete(struct gdlm_lock *lp)
+{
+ struct gdlm_ls *ls = lp->ls;
+ struct lm_async_cb acb;
+ s16 prev_mode = lp->cur;
+
+ memset(&acb, 0, sizeof(acb));
+
+ if (lp->lksb.sb_status == -DLM_ECANCEL) {
+ log_info("complete dlm cancel %x,%llx flags %lx",
+ lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number,
+ lp->flags);
+
+ lp->req = lp->cur;
+ acb.lc_ret |= LM_OUT_CANCELED;
+ if (lp->cur == DLM_LOCK_IV)
+ lp->lksb.sb_lkid = 0;
+ goto out;
+ }
+
+ if (test_and_clear_bit(LFL_DLM_UNLOCK, &lp->flags)) {
+ if (lp->lksb.sb_status != -DLM_EUNLOCK) {
+ log_info("unlock sb_status %d %x,%llx flags %lx",
+ lp->lksb.sb_status, lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number,
+ lp->flags);
+ return;
+ }
+
+ lp->cur = DLM_LOCK_IV;
+ lp->req = DLM_LOCK_IV;
+ lp->lksb.sb_lkid = 0;
+
+ if (test_and_clear_bit(LFL_UNLOCK_DELETE, &lp->flags)) {
+ gdlm_delete_lp(lp);
+ return;
+ }
+ goto out;
+ }
+
+ if (lp->lksb.sb_flags & DLM_SBF_VALNOTVALID)
+ memset(lp->lksb.sb_lvbptr, 0, GDLM_LVB_SIZE);
+
+ if (lp->lksb.sb_flags & DLM_SBF_ALTMODE) {
+ if (lp->req == DLM_LOCK_PR)
+ lp->req = DLM_LOCK_CW;
+ else if (lp->req == DLM_LOCK_CW)
+ lp->req = DLM_LOCK_PR;
+ }
+
+ /*
+ * A canceled lock request. The lock was just taken off the delayed
+ * list and was never even submitted to dlm.
+ */
+
+ if (test_and_clear_bit(LFL_CANCEL, &lp->flags)) {
+ log_info("complete internal cancel %x,%llx",
+ lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number);
+ lp->req = lp->cur;
+ acb.lc_ret |= LM_OUT_CANCELED;
+ goto out;
+ }
+
+ /*
+ * An error occured.
+ */
+
+ if (lp->lksb.sb_status) {
+ /* a "normal" error */
+ if ((lp->lksb.sb_status == -EAGAIN) &&
+ (lp->lkf & DLM_LKF_NOQUEUE)) {
+ lp->req = lp->cur;
+ if (lp->cur == DLM_LOCK_IV)
+ lp->lksb.sb_lkid = 0;
+ goto out;
+ }
+
+ /* this could only happen with cancels I think */
+ log_info("ast sb_status %d %x,%llx flags %lx",
+ lp->lksb.sb_status, lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number,
+ lp->flags);
+ if (lp->lksb.sb_status == -EDEADLOCK &&
+ lp->ls->fsflags & LM_MFLAG_CONV_NODROP) {
+ lp->req = lp->cur;
+ acb.lc_ret |= LM_OUT_CONV_DEADLK;
+ if (lp->cur == DLM_LOCK_IV)
+ lp->lksb.sb_lkid = 0;
+ goto out;
+ } else
+ return;
+ }
+
+ /*
+ * This is an AST for an EX->EX conversion for sync_lvb from GFS.
+ */
+
+ if (test_and_clear_bit(LFL_SYNC_LVB, &lp->flags)) {
+ wake_up_ast(lp);
+ return;
+ }
+
+ /*
+ * A lock has been demoted to NL because it initially completed during
+ * BLOCK_LOCKS. Now it must be requested in the originally requested
+ * mode.
+ */
+
+ if (test_and_clear_bit(LFL_REREQUEST, &lp->flags)) {
+ gdlm_assert(lp->req == DLM_LOCK_NL, "%x,%llx",
+ lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number);
+ gdlm_assert(lp->prev_req > DLM_LOCK_NL, "%x,%llx",
+ lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number);
+
+ lp->cur = DLM_LOCK_NL;
+ lp->req = lp->prev_req;
+ lp->prev_req = DLM_LOCK_IV;
+ lp->lkf &= ~DLM_LKF_CONVDEADLK;
+
+ set_bit(LFL_NOCACHE, &lp->flags);
+
+ if (test_bit(DFL_BLOCK_LOCKS, &ls->flags) &&
+ !test_bit(LFL_NOBLOCK, &lp->flags))
+ gdlm_queue_delayed(lp);
+ else
+ queue_submit(lp);
+ return;
+ }
+
+ /*
+ * A request is granted during dlm recovery. It may be granted
+ * because the locks of a failed node were cleared. In that case,
+ * there may be inconsistent data beneath this lock and we must wait
+ * for recovery to complete to use it. When gfs recovery is done this
+ * granted lock will be converted to NL and then reacquired in this
+ * granted state.
+ */
+
+ if (test_bit(DFL_BLOCK_LOCKS, &ls->flags) &&
+ !test_bit(LFL_NOBLOCK, &lp->flags) &&
+ lp->req != DLM_LOCK_NL) {
+
+ lp->cur = lp->req;
+ lp->prev_req = lp->req;
+ lp->req = DLM_LOCK_NL;
+ lp->lkf |= DLM_LKF_CONVERT;
+ lp->lkf &= ~DLM_LKF_CONVDEADLK;
+
+ log_debug("rereq %x,%llx id %x %d,%d",
+ lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number,
+ lp->lksb.sb_lkid, lp->cur, lp->req);
+
+ set_bit(LFL_REREQUEST, &lp->flags);
+ queue_submit(lp);
+ return;
+ }
+
+ /*
+ * DLM demoted the lock to NL before it was granted so GFS must be
+ * told it cannot cache data for this lock.
+ */
+
+ if (lp->lksb.sb_flags & DLM_SBF_DEMOTED)
+ set_bit(LFL_NOCACHE, &lp->flags);
+
+out:
+ /*
+ * This is an internal lock_dlm lock
+ */
+
+ if (test_bit(LFL_INLOCK, &lp->flags)) {
+ clear_bit(LFL_NOBLOCK, &lp->flags);
+ lp->cur = lp->req;
+ wake_up_ast(lp);
+ return;
+ }
+
+ /*
+ * Normal completion of a lock request. Tell GFS it now has the lock.
+ */
+
+ clear_bit(LFL_NOBLOCK, &lp->flags);
+ lp->cur = lp->req;
+
+ acb.lc_name = lp->lockname;
+ acb.lc_ret |= gdlm_make_lmstate(lp->cur);
+
+ if (!test_and_clear_bit(LFL_NOCACHE, &lp->flags) &&
+ (lp->cur > DLM_LOCK_NL) && (prev_mode > DLM_LOCK_NL))
+ acb.lc_ret |= LM_OUT_CACHEABLE;
+
+ ls->fscb(ls->sdp, LM_CB_ASYNC, &acb);
+}
+
+static void gdlm_ast(void *astarg)
+{
+ struct gdlm_lock *lp = astarg;
+ clear_bit(LFL_ACTIVE, &lp->flags);
+ process_complete(lp);
+}
+
+static void process_blocking(struct gdlm_lock *lp, int bast_mode)
+{
+ struct gdlm_ls *ls = lp->ls;
+ unsigned int cb = 0;
+
+ switch (gdlm_make_lmstate(bast_mode)) {
+ case LM_ST_EXCLUSIVE:
+ cb = LM_CB_NEED_E;
+ break;
+ case LM_ST_DEFERRED:
+ cb = LM_CB_NEED_D;
+ break;
+ case LM_ST_SHARED:
+ cb = LM_CB_NEED_S;
+ break;
+ default:
+ gdlm_assert(0, "unknown bast mode %u", bast_mode);
+ }
+
+ ls->fscb(ls->sdp, cb, &lp->lockname);
+}
+
+
+static void gdlm_bast(void *astarg, int mode)
+{
+ struct gdlm_lock *lp = astarg;
+
+ if (!mode) {
+ printk(KERN_INFO "lock_dlm: bast mode zero %x,%llx\n",
+ lp->lockname.ln_type,
+ (unsigned long long)lp->lockname.ln_number);
+ return;
+ }
+
+ process_blocking(lp, mode);
+}
+
/* convert gfs lock-state to dlm lock-mode */
static s16 make_mode(s16 lmstate)
@@ -77,24 +337,6 @@ static s16 make_mode(s16 lmstate)
return -1;
}
-/* convert dlm lock-mode to gfs lock-state */
-
-s16 gdlm_make_lmstate(s16 dlmmode)
-{
- switch (dlmmode) {
- case DLM_LOCK_IV:
- case DLM_LOCK_NL:
- return LM_ST_UNLOCKED;
- case DLM_LOCK_EX:
- return LM_ST_EXCLUSIVE;
- case DLM_LOCK_CW:
- return LM_ST_DEFERRED;
- case DLM_LOCK_PR:
- return LM_ST_SHARED;
- }
- gdlm_assert(0, "unknown DLM mode %d", dlmmode);
- return -1;
-}
/* verify agreement with GFS on the current lock state, NB: DLM_LOCK_NL and
DLM_LOCK_IV are both considered LM_ST_UNLOCKED by GFS. */
@@ -173,10 +415,6 @@ static int gdlm_create_lp(struct gdlm_ls *ls, struct lm_lockname *name,
make_strname(name, &lp->strname);
lp->ls = ls;
lp->cur = DLM_LOCK_IV;
- lp->lvb = NULL;
- lp->hold_null = NULL;
- INIT_LIST_HEAD(&lp->clist);
- INIT_LIST_HEAD(&lp->blist);
INIT_LIST_HEAD(&lp->delay_list);
spin_lock(&ls->async_lock);
@@ -188,26 +426,6 @@ static int gdlm_create_lp(struct gdlm_ls *ls, struct lm_lockname *name,
return 0;
}
-void gdlm_delete_lp(struct gdlm_lock *lp)
-{
- struct gdlm_ls *ls = lp->ls;
-
- spin_lock(&ls->async_lock);
- if (!list_empty(&lp->clist))
- list_del_init(&lp->clist);
- if (!list_empty(&lp->blist))
- list_del_init(&lp->blist);
- if (!list_empty(&lp->delay_list))
- list_del_init(&lp->delay_list);
- gdlm_assert(!list_empty(&lp->all_list), "%x,%llx", lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number);
- list_del_init(&lp->all_list);
- ls->all_locks_count--;
- spin_unlock(&ls->async_lock);
-
- kfree(lp);
-}
-
int gdlm_get_lock(void *lockspace, struct lm_lockname *name,
void **lockp)
{
@@ -261,7 +479,7 @@ unsigned int gdlm_do_lock(struct gdlm_lock *lp)
if ((error == -EAGAIN) && (lp->lkf & DLM_LKF_NOQUEUE)) {
lp->lksb.sb_status = -EAGAIN;
- queue_complete(lp);
+ gdlm_ast(lp);
error = 0;
}
@@ -311,6 +529,9 @@ unsigned int gdlm_lock(void *lock, unsigned int cur_state,
if (req_state == LM_ST_UNLOCKED)
return gdlm_unlock(lock, cur_state);
+ if (req_state == LM_ST_UNLOCKED)
+ return gdlm_unlock(lock, cur_state);
+
clear_bit(LFL_DLM_CANCEL, &lp->flags);
if (flags & LM_FLAG_NOEXP)
set_bit(LFL_NOBLOCK, &lp->flags);
@@ -354,7 +575,7 @@ void gdlm_cancel(void *lock)
if (delay_list) {
set_bit(LFL_CANCEL, &lp->flags);
set_bit(LFL_ACTIVE, &lp->flags);
- queue_complete(lp);
+ gdlm_ast(lp);
return;
}
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index a243cf6..ad944c6 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -72,15 +72,12 @@ struct gdlm_ls {
int recover_jid_done;
int recover_jid_status;
spinlock_t async_lock;
- struct list_head complete;
- struct list_head blocking;
struct list_head delayed;
struct list_head submit;
struct list_head all_locks;
u32 all_locks_count;
wait_queue_head_t wait_control;
- struct task_struct *thread1;
- struct task_struct *thread2;
+ struct task_struct *thread;
wait_queue_head_t thread_wait;
unsigned long drop_time;
int drop_locks_count;
@@ -117,10 +114,6 @@ struct gdlm_lock {
u32 lkf; /* dlm flags DLM_LKF_ */
unsigned long flags; /* lock_dlm flags LFL_ */
- int bast_mode; /* protected by async_lock */
-
- struct list_head clist; /* complete */
- struct list_head blist; /* blocking */
struct list_head delay_list; /* delayed */
struct list_head all_list; /* all locks for the fs */
struct gdlm_lock *hold_null; /* NL lock for hold_lvb */
@@ -159,11 +152,8 @@ void gdlm_release_threads(struct gdlm_ls *);
/* lock.c */
-s16 gdlm_make_lmstate(s16);
-void gdlm_queue_delayed(struct gdlm_lock *);
void gdlm_submit_delayed(struct gdlm_ls *);
int gdlm_release_all_locks(struct gdlm_ls *);
-void gdlm_delete_lp(struct gdlm_lock *);
unsigned int gdlm_do_lock(struct gdlm_lock *);
int gdlm_get_lock(void *, struct lm_lockname *, void **);
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index 470bdf6..0628520 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -28,15 +28,11 @@ static struct gdlm_ls *init_gdlm(lm_callback_t cb, struct gfs2_sbd *sdp,
ls->sdp = sdp;
ls->fsflags = flags;
spin_lock_init(&ls->async_lock);
- INIT_LIST_HEAD(&ls->complete);
- INIT_LIST_HEAD(&ls->blocking);
INIT_LIST_HEAD(&ls->delayed);
INIT_LIST_HEAD(&ls->submit);
INIT_LIST_HEAD(&ls->all_locks);
init_waitqueue_head(&ls->thread_wait);
init_waitqueue_head(&ls->wait_control);
- ls->thread1 = NULL;
- ls->thread2 = NULL;
ls->drop_time = jiffies;
ls->jid = -1;
diff --git a/fs/gfs2/locking/dlm/thread.c b/fs/gfs2/locking/dlm/thread.c
index e53db6f..f30350a 100644
--- a/fs/gfs2/locking/dlm/thread.c
+++ b/fs/gfs2/locking/dlm/thread.c
@@ -9,255 +9,12 @@
#include "lock_dlm.h"
-/* A lock placed on this queue is re-submitted to DLM as soon as the lock_dlm
- thread gets to it. */
-
-static void queue_submit(struct gdlm_lock *lp)
-{
- struct gdlm_ls *ls = lp->ls;
-
- spin_lock(&ls->async_lock);
- list_add_tail(&lp->delay_list, &ls->submit);
- spin_unlock(&ls->async_lock);
- wake_up(&ls->thread_wait);
-}
-
-static void process_blocking(struct gdlm_lock *lp, int bast_mode)
-{
- struct gdlm_ls *ls = lp->ls;
- unsigned int cb = 0;
-
- switch (gdlm_make_lmstate(bast_mode)) {
- case LM_ST_EXCLUSIVE:
- cb = LM_CB_NEED_E;
- break;
- case LM_ST_DEFERRED:
- cb = LM_CB_NEED_D;
- break;
- case LM_ST_SHARED:
- cb = LM_CB_NEED_S;
- break;
- default:
- gdlm_assert(0, "unknown bast mode %u", lp->bast_mode);
- }
-
- ls->fscb(ls->sdp, cb, &lp->lockname);
-}
-
-static void wake_up_ast(struct gdlm_lock *lp)
-{
- clear_bit(LFL_AST_WAIT, &lp->flags);
- smp_mb__after_clear_bit();
- wake_up_bit(&lp->flags, LFL_AST_WAIT);
-}
-
-static void process_complete(struct gdlm_lock *lp)
-{
- struct gdlm_ls *ls = lp->ls;
- struct lm_async_cb acb;
- s16 prev_mode = lp->cur;
-
- memset(&acb, 0, sizeof(acb));
-
- if (lp->lksb.sb_status == -DLM_ECANCEL) {
- log_info("complete dlm cancel %x,%llx flags %lx",
- lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number,
- lp->flags);
-
- lp->req = lp->cur;
- acb.lc_ret |= LM_OUT_CANCELED;
- if (lp->cur == DLM_LOCK_IV)
- lp->lksb.sb_lkid = 0;
- goto out;
- }
-
- if (test_and_clear_bit(LFL_DLM_UNLOCK, &lp->flags)) {
- if (lp->lksb.sb_status != -DLM_EUNLOCK) {
- log_info("unlock sb_status %d %x,%llx flags %lx",
- lp->lksb.sb_status, lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number,
- lp->flags);
- return;
- }
-
- lp->cur = DLM_LOCK_IV;
- lp->req = DLM_LOCK_IV;
- lp->lksb.sb_lkid = 0;
-
- if (test_and_clear_bit(LFL_UNLOCK_DELETE, &lp->flags)) {
- gdlm_delete_lp(lp);
- return;
- }
- goto out;
- }
-
- if (lp->lksb.sb_flags & DLM_SBF_VALNOTVALID)
- memset(lp->lksb.sb_lvbptr, 0, GDLM_LVB_SIZE);
-
- if (lp->lksb.sb_flags & DLM_SBF_ALTMODE) {
- if (lp->req == DLM_LOCK_PR)
- lp->req = DLM_LOCK_CW;
- else if (lp->req == DLM_LOCK_CW)
- lp->req = DLM_LOCK_PR;
- }
-
- /*
- * A canceled lock request. The lock was just taken off the delayed
- * list and was never even submitted to dlm.
- */
-
- if (test_and_clear_bit(LFL_CANCEL, &lp->flags)) {
- log_info("complete internal cancel %x,%llx",
- lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number);
- lp->req = lp->cur;
- acb.lc_ret |= LM_OUT_CANCELED;
- goto out;
- }
-
- /*
- * An error occured.
- */
-
- if (lp->lksb.sb_status) {
- /* a "normal" error */
- if ((lp->lksb.sb_status == -EAGAIN) &&
- (lp->lkf & DLM_LKF_NOQUEUE)) {
- lp->req = lp->cur;
- if (lp->cur == DLM_LOCK_IV)
- lp->lksb.sb_lkid = 0;
- goto out;
- }
-
- /* this could only happen with cancels I think */
- log_info("ast sb_status %d %x,%llx flags %lx",
- lp->lksb.sb_status, lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number,
- lp->flags);
- if (lp->lksb.sb_status == -EDEADLOCK &&
- lp->ls->fsflags & LM_MFLAG_CONV_NODROP) {
- lp->req = lp->cur;
- acb.lc_ret |= LM_OUT_CONV_DEADLK;
- if (lp->cur == DLM_LOCK_IV)
- lp->lksb.sb_lkid = 0;
- goto out;
- } else
- return;
- }
-
- /*
- * This is an AST for an EX->EX conversion for sync_lvb from GFS.
- */
-
- if (test_and_clear_bit(LFL_SYNC_LVB, &lp->flags)) {
- wake_up_ast(lp);
- return;
- }
-
- /*
- * A lock has been demoted to NL because it initially completed during
- * BLOCK_LOCKS. Now it must be requested in the originally requested
- * mode.
- */
-
- if (test_and_clear_bit(LFL_REREQUEST, &lp->flags)) {
- gdlm_assert(lp->req == DLM_LOCK_NL, "%x,%llx",
- lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number);
- gdlm_assert(lp->prev_req > DLM_LOCK_NL, "%x,%llx",
- lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number);
-
- lp->cur = DLM_LOCK_NL;
- lp->req = lp->prev_req;
- lp->prev_req = DLM_LOCK_IV;
- lp->lkf &= ~DLM_LKF_CONVDEADLK;
-
- set_bit(LFL_NOCACHE, &lp->flags);
-
- if (test_bit(DFL_BLOCK_LOCKS, &ls->flags) &&
- !test_bit(LFL_NOBLOCK, &lp->flags))
- gdlm_queue_delayed(lp);
- else
- queue_submit(lp);
- return;
- }
-
- /*
- * A request is granted during dlm recovery. It may be granted
- * because the locks of a failed node were cleared. In that case,
- * there may be inconsistent data beneath this lock and we must wait
- * for recovery to complete to use it. When gfs recovery is done this
- * granted lock will be converted to NL and then reacquired in this
- * granted state.
- */
-
- if (test_bit(DFL_BLOCK_LOCKS, &ls->flags) &&
- !test_bit(LFL_NOBLOCK, &lp->flags) &&
- lp->req != DLM_LOCK_NL) {
-
- lp->cur = lp->req;
- lp->prev_req = lp->req;
- lp->req = DLM_LOCK_NL;
- lp->lkf |= DLM_LKF_CONVERT;
- lp->lkf &= ~DLM_LKF_CONVDEADLK;
-
- log_debug("rereq %x,%llx id %x %d,%d",
- lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number,
- lp->lksb.sb_lkid, lp->cur, lp->req);
-
- set_bit(LFL_REREQUEST, &lp->flags);
- queue_submit(lp);
- return;
- }
-
- /*
- * DLM demoted the lock to NL before it was granted so GFS must be
- * told it cannot cache data for this lock.
- */
-
- if (lp->lksb.sb_flags & DLM_SBF_DEMOTED)
- set_bit(LFL_NOCACHE, &lp->flags);
-
-out:
- /*
- * This is an internal lock_dlm lock
- */
-
- if (test_bit(LFL_INLOCK, &lp->flags)) {
- clear_bit(LFL_NOBLOCK, &lp->flags);
- lp->cur = lp->req;
- wake_up_ast(lp);
- return;
- }
-
- /*
- * Normal completion of a lock request. Tell GFS it now has the lock.
- */
-
- clear_bit(LFL_NOBLOCK, &lp->flags);
- lp->cur = lp->req;
-
- acb.lc_name = lp->lockname;
- acb.lc_ret |= gdlm_make_lmstate(lp->cur);
-
- if (!test_and_clear_bit(LFL_NOCACHE, &lp->flags) &&
- (lp->cur > DLM_LOCK_NL) && (prev_mode > DLM_LOCK_NL))
- acb.lc_ret |= LM_OUT_CACHEABLE;
-
- ls->fscb(ls->sdp, LM_CB_ASYNC, &acb);
-}
-
-static inline int no_work(struct gdlm_ls *ls, int blocking)
+static inline int no_work(struct gdlm_ls *ls)
{
int ret;
spin_lock(&ls->async_lock);
- ret = list_empty(&ls->complete) && list_empty(&ls->submit);
- if (ret && blocking)
- ret = list_empty(&ls->blocking);
+ ret = list_empty(&ls->submit);
spin_unlock(&ls->async_lock);
return ret;
@@ -276,100 +33,55 @@ static inline int check_drop(struct gdlm_ls *ls)
return 0;
}
-static int gdlm_thread(void *data, int blist)
+static int gdlm_thread(void *data)
{
struct gdlm_ls *ls = (struct gdlm_ls *) data;
struct gdlm_lock *lp = NULL;
- uint8_t complete, blocking, submit, drop;
-
- /* Only thread1 is allowed to do blocking callbacks since gfs
- may wait for a completion callback within a blocking cb. */
while (!kthread_should_stop()) {
wait_event_interruptible(ls->thread_wait,
- !no_work(ls, blist) || kthread_should_stop());
-
- complete = blocking = submit = drop = 0;
+ !no_work(ls) || kthread_should_stop());
spin_lock(&ls->async_lock);
- if (blist && !list_empty(&ls->blocking)) {
- lp = list_entry(ls->blocking.next, struct gdlm_lock,
- blist);
- list_del_init(&lp->blist);
- blocking = lp->bast_mode;
- lp->bast_mode = 0;
- } else if (!list_empty(&ls->complete)) {
- lp = list_entry(ls->complete.next, struct gdlm_lock,
- clist);
- list_del_init(&lp->clist);
- complete = 1;
- } else if (!list_empty(&ls->submit)) {
+ if (!list_empty(&ls->submit)) {
lp = list_entry(ls->submit.next, struct gdlm_lock,
delay_list);
list_del_init(&lp->delay_list);
- submit = 1;
- }
-
- drop = check_drop(ls);
- spin_unlock(&ls->async_lock);
-
- if (complete)
- process_complete(lp);
-
- else if (blocking)
- process_blocking(lp, blocking);
-
- else if (submit)
+ spin_unlock(&ls->async_lock);
gdlm_do_lock(lp);
-
- if (drop)
+ spin_lock(&ls->async_lock);
+ }
+ /* Does this ever happen these days? I hope not anyway */
+ if (check_drop(ls)) {
+ spin_unlock(&ls->async_lock);
ls->fscb(ls->sdp, LM_CB_DROPLOCKS, NULL);
-
- schedule();
+ spin_lock(&ls->async_lock);
+ }
+ spin_unlock(&ls->async_lock);
}
return 0;
}
-static int gdlm_thread1(void *data)
-{
- return gdlm_thread(data, 1);
-}
-
-static int gdlm_thread2(void *data)
-{
- return gdlm_thread(data, 0);
-}
-
int gdlm_init_threads(struct gdlm_ls *ls)
{
struct task_struct *p;
int error;
- p = kthread_run(gdlm_thread1, ls, "lock_dlm1");
- error = IS_ERR(p);
- if (error) {
- log_error("can't start lock_dlm1 thread %d", error);
- return error;
- }
- ls->thread1 = p;
-
- p = kthread_run(gdlm_thread2, ls, "lock_dlm2");
+ p = kthread_run(gdlm_thread, ls, "lock_dlm");
error = IS_ERR(p);
if (error) {
- log_error("can't start lock_dlm2 thread %d", error);
- kthread_stop(ls->thread1);
+ log_error("can't start lock_dlm thread %d", error);
return error;
}
- ls->thread2 = p;
+ ls->thread = p;
return 0;
}
void gdlm_release_threads(struct gdlm_ls *ls)
{
- kthread_stop(ls->thread1);
- kthread_stop(ls->thread2);
+ kthread_stop(ls->thread);
}
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 03/18] [GFS2] No lock_nolock
2008-07-11 10:11 ` [Cluster-devel] [PATCH 02/18] [GFS2] Fix ordering bug in lock_dlm swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 04/18] [GFS2] trivial sparse lock annotations swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
This patch merges the lock_nolock module into GFS2 itself. As well as removing
some of the overhead of the module, it also means that its now impossible to
build GFS2 without a lock module (which would be a pointless thing to do
anyway).
We also plan to merge lock_dlm into GFS2 in the future, but that is a more
tricky task, and will therefore be a separate patch.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Teigland <teigland@redhat.com>
diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index 7f7947e..ab2f57e 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -14,23 +14,11 @@ config GFS2_FS
GFS is perfect consistency -- changes made to the filesystem on one
machine show up immediately on all other machines in the cluster.
- To use the GFS2 filesystem, you will need to enable one or more of
- the below locking modules. Documentation and utilities for GFS2 can
+ To use the GFS2 filesystem in a cluster, you will need to enable
+ the locking module below. Documentation and utilities for GFS2 can
be found here: http://sources.redhat.com/cluster
-config GFS2_FS_LOCKING_NOLOCK
- tristate "GFS2 \"nolock\" locking module"
- depends on GFS2_FS
- help
- Single node locking module for GFS2.
-
- Use this module if you want to use GFS2 on a single node without
- its clustering features. You can still take advantage of the
- large file support, and upgrade to running a full cluster later on
- if required.
-
- If you will only be using GFS2 in cluster mode, you do not need this
- module.
+ The "nolock" lock module is now built in to GFS2 by default.
config GFS2_FS_LOCKING_DLM
tristate "GFS2 DLM locking module"
diff --git a/fs/gfs2/Makefile b/fs/gfs2/Makefile
index e2350df..ec65851 100644
--- a/fs/gfs2/Makefile
+++ b/fs/gfs2/Makefile
@@ -5,6 +5,5 @@ gfs2-y := acl.o bmap.o daemon.o dir.o eaops.o eattr.o glock.o \
ops_fstype.o ops_inode.o ops_super.o quota.o \
recovery.o rgrp.o super.o sys.o trans.o util.o
-obj-$(CONFIG_GFS2_FS_LOCKING_NOLOCK) += locking/nolock/
obj-$(CONFIG_GFS2_FS_LOCKING_DLM) += locking/dlm/
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 519a54c..be7ed50 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -153,7 +153,7 @@ static void glock_free(struct gfs2_glock *gl)
struct gfs2_sbd *sdp = gl->gl_sbd;
struct inode *aspace = gl->gl_aspace;
- if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ if (sdp->sd_lockstruct.ls_ops->lm_put_lock)
sdp->sd_lockstruct.ls_ops->lm_put_lock(gl->gl_lock);
if (aspace)
@@ -488,6 +488,10 @@ static unsigned int gfs2_lm_lock(struct gfs2_sbd *sdp, void *lock,
unsigned int flags)
{
int ret = LM_OUT_ERROR;
+
+ if (!sdp->sd_lockstruct.ls_ops->lm_lock)
+ return req_state == LM_ST_UNLOCKED ? 0 : req_state;
+
if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
ret = sdp->sd_lockstruct.ls_ops->lm_lock(lock, cur_state,
req_state, flags);
@@ -631,6 +635,8 @@ static int gfs2_lm_get_lock(struct gfs2_sbd *sdp, struct lm_lockname *name,
void **lockp)
{
int error = -EIO;
+ if (!sdp->sd_lockstruct.ls_ops->lm_get_lock)
+ return 0;
if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
error = sdp->sd_lockstruct.ls_ops->lm_get_lock(
sdp->sd_lockstruct.ls_lockspace, name, lockp);
@@ -910,7 +916,8 @@ do_cancel:
gh = list_entry(gl->gl_holders.next, struct gfs2_holder, gh_list);
if (!(gh->gh_flags & LM_FLAG_PRIORITY)) {
spin_unlock(&gl->gl_spin);
- sdp->sd_lockstruct.ls_ops->lm_cancel(gl->gl_lock);
+ if (sdp->sd_lockstruct.ls_ops->lm_cancel)
+ sdp->sd_lockstruct.ls_ops->lm_cancel(gl->gl_lock);
spin_lock(&gl->gl_spin);
}
return;
@@ -1187,6 +1194,8 @@ void gfs2_glock_dq_uninit_m(unsigned int num_gh, struct gfs2_holder *ghs)
static int gfs2_lm_hold_lvb(struct gfs2_sbd *sdp, void *lock, char **lvbp)
{
int error = -EIO;
+ if (!sdp->sd_lockstruct.ls_ops->lm_hold_lvb)
+ return 0;
if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
error = sdp->sd_lockstruct.ls_ops->lm_hold_lvb(lock, lvbp);
return error;
@@ -1226,7 +1235,7 @@ void gfs2_lvb_unhold(struct gfs2_glock *gl)
gfs2_glock_hold(gl);
gfs2_assert(gl->gl_sbd, atomic_read(&gl->gl_lvb_count) > 0);
if (atomic_dec_and_test(&gl->gl_lvb_count)) {
- if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ if (sdp->sd_lockstruct.ls_ops->lm_unhold_lvb)
sdp->sd_lockstruct.ls_ops->lm_unhold_lvb(gl->gl_lock, gl->gl_lvb);
gl->gl_lvb = NULL;
gfs2_glock_put(gl);
diff --git a/fs/gfs2/locking.c b/fs/gfs2/locking.c
index 663fee7..a4a367a 100644
--- a/fs/gfs2/locking.c
+++ b/fs/gfs2/locking.c
@@ -23,12 +23,54 @@ struct lmh_wrapper {
const struct lm_lockops *lw_ops;
};
+static int nolock_mount(char *table_name, char *host_data,
+ lm_callback_t cb, void *cb_data,
+ unsigned int min_lvb_size, int flags,
+ struct lm_lockstruct *lockstruct,
+ struct kobject *fskobj);
+
/* List of registered low-level locking protocols. A file system selects one
of them by name at mount time, e.g. lock_nolock, lock_dlm. */
+static const struct lm_lockops nolock_ops = {
+ .lm_proto_name = "lock_nolock",
+ .lm_mount = nolock_mount,
+};
+
+static struct lmh_wrapper nolock_proto = {
+ .lw_list = LIST_HEAD_INIT(nolock_proto.lw_list),
+ .lw_ops = &nolock_ops,
+};
+
static LIST_HEAD(lmh_list);
static DEFINE_MUTEX(lmh_lock);
+static int nolock_mount(char *table_name, char *host_data,
+ lm_callback_t cb, void *cb_data,
+ unsigned int min_lvb_size, int flags,
+ struct lm_lockstruct *lockstruct,
+ struct kobject *fskobj)
+{
+ char *c;
+ unsigned int jid;
+
+ c = strstr(host_data, "jid=");
+ if (!c)
+ jid = 0;
+ else {
+ c += 4;
+ sscanf(c, "%u", &jid);
+ }
+
+ lockstruct->ls_jid = jid;
+ lockstruct->ls_first = 1;
+ lockstruct->ls_lvb_size = min_lvb_size;
+ lockstruct->ls_ops = &nolock_ops;
+ lockstruct->ls_flags = LM_LSFLAG_LOCAL;
+
+ return 0;
+}
+
/**
* gfs2_register_lockproto - Register a low-level locking protocol
* @proto: the protocol definition
@@ -116,9 +158,13 @@ int gfs2_mount_lockproto(char *proto_name, char *table_name, char *host_data,
int try = 0;
int error, found;
+
retry:
mutex_lock(&lmh_lock);
+ if (list_empty(&nolock_proto.lw_list))
+ list_add(&lmh_list, &nolock_proto.lw_list);
+
found = 0;
list_for_each_entry(lw, &lmh_list, lw_list) {
if (!strcmp(lw->lw_ops->lm_proto_name, proto_name)) {
@@ -139,7 +185,8 @@ retry:
goto out;
}
- if (!try_module_get(lw->lw_ops->lm_owner)) {
+ if (lw->lw_ops->lm_owner &&
+ !try_module_get(lw->lw_ops->lm_owner)) {
try = 0;
mutex_unlock(&lmh_lock);
msleep(1000);
@@ -158,7 +205,8 @@ out:
void gfs2_unmount_lockproto(struct lm_lockstruct *lockstruct)
{
mutex_lock(&lmh_lock);
- lockstruct->ls_ops->lm_unmount(lockstruct->ls_lockspace);
+ if (lockstruct->ls_ops->lm_unmount)
+ lockstruct->ls_ops->lm_unmount(lockstruct->ls_lockspace);
if (lockstruct->ls_ops->lm_owner)
module_put(lockstruct->ls_ops->lm_owner);
mutex_unlock(&lmh_lock);
diff --git a/fs/gfs2/locking/nolock/Makefile b/fs/gfs2/locking/nolock/Makefile
deleted file mode 100644
index 35e9730..0000000
--- a/fs/gfs2/locking/nolock/Makefile
+++ /dev/null
@@ -1,3 +0,0 @@
-obj-$(CONFIG_GFS2_FS_LOCKING_NOLOCK) += lock_nolock.o
-lock_nolock-y := main.o
-
diff --git a/fs/gfs2/locking/nolock/main.c b/fs/gfs2/locking/nolock/main.c
deleted file mode 100644
index 627bfb7..0000000
--- a/fs/gfs2/locking/nolock/main.c
+++ /dev/null
@@ -1,240 +0,0 @@
-/*
- * Copyright (C) Sistina Software, Inc. 1997-2003 All rights reserved.
- * Copyright (C) 2004-2005 Red Hat, Inc. All rights reserved.
- *
- * This copyrighted material is made available to anyone wishing to use,
- * modify, copy, or redistribute it subject to the terms and conditions
- * of the GNU General Public License version 2.
- */
-
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/init.h>
-#include <linux/types.h>
-#include <linux/fs.h>
-#include <linux/lm_interface.h>
-
-struct nolock_lockspace {
- unsigned int nl_lvb_size;
-};
-
-static const struct lm_lockops nolock_ops;
-
-static int nolock_mount(char *table_name, char *host_data,
- lm_callback_t cb, void *cb_data,
- unsigned int min_lvb_size, int flags,
- struct lm_lockstruct *lockstruct,
- struct kobject *fskobj)
-{
- char *c;
- unsigned int jid;
- struct nolock_lockspace *nl;
-
- c = strstr(host_data, "jid=");
- if (!c)
- jid = 0;
- else {
- c += 4;
- sscanf(c, "%u", &jid);
- }
-
- nl = kzalloc(sizeof(struct nolock_lockspace), GFP_KERNEL);
- if (!nl)
- return -ENOMEM;
-
- nl->nl_lvb_size = min_lvb_size;
-
- lockstruct->ls_jid = jid;
- lockstruct->ls_first = 1;
- lockstruct->ls_lvb_size = min_lvb_size;
- lockstruct->ls_lockspace = nl;
- lockstruct->ls_ops = &nolock_ops;
- lockstruct->ls_flags = LM_LSFLAG_LOCAL;
-
- return 0;
-}
-
-static void nolock_others_may_mount(void *lockspace)
-{
-}
-
-static void nolock_unmount(void *lockspace)
-{
- struct nolock_lockspace *nl = lockspace;
- kfree(nl);
-}
-
-static void nolock_withdraw(void *lockspace)
-{
-}
-
-/**
- * nolock_get_lock - get a lm_lock_t given a descripton of the lock
- * @lockspace: the lockspace the lock lives in
- * @name: the name of the lock
- * @lockp: return the lm_lock_t here
- *
- * Returns: 0 on success, -EXXX on failure
- */
-
-static int nolock_get_lock(void *lockspace, struct lm_lockname *name,
- void **lockp)
-{
- *lockp = lockspace;
- return 0;
-}
-
-/**
- * nolock_put_lock - get rid of a lock structure
- * @lock: the lock to throw away
- *
- */
-
-static void nolock_put_lock(void *lock)
-{
-}
-
-/**
- * nolock_lock - acquire a lock
- * @lock: the lock to manipulate
- * @cur_state: the current state
- * @req_state: the requested state
- * @flags: modifier flags
- *
- * Returns: A bitmap of LM_OUT_*
- */
-
-static unsigned int nolock_lock(void *lock, unsigned int cur_state,
- unsigned int req_state, unsigned int flags)
-{
- if (req_state == LM_ST_UNLOCKED)
- return 0;
- return req_state | LM_OUT_CACHEABLE;
-}
-
-/**
- * nolock_unlock - unlock a lock
- * @lock: the lock to manipulate
- * @cur_state: the current state
- *
- * Returns: 0
- */
-
-static unsigned int nolock_unlock(void *lock, unsigned int cur_state)
-{
- return 0;
-}
-
-static void nolock_cancel(void *lock)
-{
-}
-
-/**
- * nolock_hold_lvb - hold on to a lock value block
- * @lock: the lock the LVB is associated with
- * @lvbp: return the lm_lvb_t here
- *
- * Returns: 0 on success, -EXXX on failure
- */
-
-static int nolock_hold_lvb(void *lock, char **lvbp)
-{
- struct nolock_lockspace *nl = lock;
- int error = 0;
-
- *lvbp = kzalloc(nl->nl_lvb_size, GFP_NOFS);
- if (!*lvbp)
- error = -ENOMEM;
-
- return error;
-}
-
-/**
- * nolock_unhold_lvb - release a LVB
- * @lock: the lock the LVB is associated with
- * @lvb: the lock value block
- *
- */
-
-static void nolock_unhold_lvb(void *lock, char *lvb)
-{
- kfree(lvb);
-}
-
-static int nolock_plock_get(void *lockspace, struct lm_lockname *name,
- struct file *file, struct file_lock *fl)
-{
- posix_test_lock(file, fl);
-
- return 0;
-}
-
-static int nolock_plock(void *lockspace, struct lm_lockname *name,
- struct file *file, int cmd, struct file_lock *fl)
-{
- int error;
- error = posix_lock_file_wait(file, fl);
- return error;
-}
-
-static int nolock_punlock(void *lockspace, struct lm_lockname *name,
- struct file *file, struct file_lock *fl)
-{
- int error;
- error = posix_lock_file_wait(file, fl);
- return error;
-}
-
-static void nolock_recovery_done(void *lockspace, unsigned int jid,
- unsigned int message)
-{
-}
-
-static const struct lm_lockops nolock_ops = {
- .lm_proto_name = "lock_nolock",
- .lm_mount = nolock_mount,
- .lm_others_may_mount = nolock_others_may_mount,
- .lm_unmount = nolock_unmount,
- .lm_withdraw = nolock_withdraw,
- .lm_get_lock = nolock_get_lock,
- .lm_put_lock = nolock_put_lock,
- .lm_lock = nolock_lock,
- .lm_unlock = nolock_unlock,
- .lm_cancel = nolock_cancel,
- .lm_hold_lvb = nolock_hold_lvb,
- .lm_unhold_lvb = nolock_unhold_lvb,
- .lm_plock_get = nolock_plock_get,
- .lm_plock = nolock_plock,
- .lm_punlock = nolock_punlock,
- .lm_recovery_done = nolock_recovery_done,
- .lm_owner = THIS_MODULE,
-};
-
-static int __init init_nolock(void)
-{
- int error;
-
- error = gfs2_register_lockproto(&nolock_ops);
- if (error) {
- printk(KERN_WARNING
- "lock_nolock: can't register protocol: %d\n", error);
- return error;
- }
-
- printk(KERN_INFO
- "Lock_Nolock (built %s %s) installed\n", __DATE__, __TIME__);
- return 0;
-}
-
-static void __exit exit_nolock(void)
-{
- gfs2_unregister_lockproto(&nolock_ops);
-}
-
-module_init(init_nolock);
-module_exit(exit_nolock);
-
-MODULE_DESCRIPTION("GFS Nolock Locking Module");
-MODULE_AUTHOR("Red Hat, Inc.");
-MODULE_LICENSE("GPL");
-
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index b2028c8..9bd97c5 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -364,6 +364,8 @@ static int map_journal_extents(struct gfs2_sbd *sdp)
static void gfs2_lm_others_may_mount(struct gfs2_sbd *sdp)
{
+ if (!sdp->sd_lockstruct.ls_ops->lm_others_may_mount)
+ return;
if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
sdp->sd_lockstruct.ls_ops->lm_others_may_mount(
sdp->sd_lockstruct.ls_lockspace);
@@ -741,8 +743,7 @@ static int gfs2_lm_mount(struct gfs2_sbd *sdp, int silent)
goto out;
}
- if (gfs2_assert_warn(sdp, sdp->sd_lockstruct.ls_lockspace) ||
- gfs2_assert_warn(sdp, sdp->sd_lockstruct.ls_ops) ||
+ if (gfs2_assert_warn(sdp, sdp->sd_lockstruct.ls_ops) ||
gfs2_assert_warn(sdp, sdp->sd_lockstruct.ls_lvb_size >=
GFS2_MIN_LVB_SIZE)) {
gfs2_unmount_lockproto(&sdp->sd_lockstruct);
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index fdd3f0f..d5e91f4 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -428,6 +428,9 @@ static int clean_journal(struct gfs2_jdesc *jd, struct gfs2_log_header_host *hea
static void gfs2_lm_recovery_done(struct gfs2_sbd *sdp, unsigned int jid,
unsigned int message)
{
+ if (!sdp->sd_lockstruct.ls_ops->lm_recovery_done)
+ return;
+
if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
sdp->sd_lockstruct.ls_ops->lm_recovery_done(
sdp->sd_lockstruct.ls_lockspace, jid, message);
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 04/18] [GFS2] trivial sparse lock annotations
2008-07-11 10:11 ` [Cluster-devel] [PATCH 03/18] [GFS2] No lock_nolock swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 05/18] [GFS2] Fix ordering of args for list_add swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Harvey Harrison <harvey.harrison@gmail.com>
Annotate the &sdp->sd_log_lock.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 548264b..6c6af9f 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -87,6 +87,8 @@ void gfs2_remove_from_ail(struct gfs2_bufdata *bd)
*/
static void gfs2_ail1_start_one(struct gfs2_sbd *sdp, struct gfs2_ail *ai)
+__releases(&sdp->sd_log_lock)
+__acquires(&sdp->sd_log_lock)
{
struct gfs2_bufdata *bd, *s;
struct buffer_head *bh;
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 7711528..7c64510 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -21,6 +21,7 @@
*/
static inline void gfs2_log_lock(struct gfs2_sbd *sdp)
+__acquires(&sdp->sd_log_lock)
{
spin_lock(&sdp->sd_log_lock);
}
@@ -32,6 +33,7 @@ static inline void gfs2_log_lock(struct gfs2_sbd *sdp)
*/
static inline void gfs2_log_unlock(struct gfs2_sbd *sdp)
+__releases(&sdp->sd_log_lock)
{
spin_unlock(&sdp->sd_log_lock);
}
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 05/18] [GFS2] Fix ordering of args for list_add
2008-07-11 10:11 ` [Cluster-devel] [PATCH 04/18] [GFS2] trivial sparse lock annotations swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 06/18] [GFS2] Revise readpage locking swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
The patch to remove lock_nolock managed to get the arguments
of this list_add backwards. This fixes it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/locking.c b/fs/gfs2/locking.c
index a4a367a..523243a 100644
--- a/fs/gfs2/locking.c
+++ b/fs/gfs2/locking.c
@@ -163,7 +163,7 @@ retry:
mutex_lock(&lmh_lock);
if (list_empty(&nolock_proto.lw_list))
- list_add(&lmh_list, &nolock_proto.lw_list);
+ list_add(&nolock_proto.lw_list, &lmh_list);
found = 0;
list_for_each_entry(lw, &lmh_list, lw_list) {
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 06/18] [GFS2] Revise readpage locking
2008-07-11 10:11 ` [Cluster-devel] [PATCH 05/18] [GFS2] Fix ordering of args for list_add swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 07/18] [GFS2] kernel panic mounting volume swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
The previous attempt to fix the locking in readpage failed due
to the use of a "try lock" which resulted in occasional high
cpu usage during testing (due to repeated tries) and also it
did not resolve all the ordering problems wrt the transaction
lock (although it did solve all the inode lock ordering problems).
This patch avoids the problem by unlocking the page and getting the
locks in the correct order. This means that we have to retest the
page to ensure that it hasn't changed when we relock the page.
This now passes the tests which were previously failing.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index 2b556dd..e64a1b0 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -499,31 +499,34 @@ static int __gfs2_readpage(void *file, struct page *page)
* @file: The file to read
* @page: The page of the file
*
- * This deals with the locking required. We use a trylock in order to
- * avoid the page lock / glock ordering problems returning AOP_TRUNCATED_PAGE
- * in the event that we are unable to get the lock.
+ * This deals with the locking required. We have to unlock and
+ * relock the page in order to get the locking in the right
+ * order.
*/
static int gfs2_readpage(struct file *file, struct page *page)
{
- struct gfs2_inode *ip = GFS2_I(page->mapping->host);
+ struct address_space *mapping = page->mapping;
+ struct gfs2_inode *ip = GFS2_I(mapping->host);
struct gfs2_holder gh;
int error;
- gfs2_holder_init(ip->i_gl, LM_ST_SHARED, GL_ATIME|LM_FLAG_TRY_1CB, &gh);
+ unlock_page(page);
+ gfs2_holder_init(ip->i_gl, LM_ST_SHARED, GL_ATIME, &gh);
error = gfs2_glock_nq_atime(&gh);
- if (unlikely(error)) {
- unlock_page(page);
+ if (unlikely(error))
goto out;
- }
- error = __gfs2_readpage(file, page);
+ error = AOP_TRUNCATED_PAGE;
+ lock_page(page);
+ if (page->mapping == mapping && !PageUptodate(page))
+ error = __gfs2_readpage(file, page);
+ else
+ unlock_page(page);
gfs2_glock_dq(&gh);
out:
gfs2_holder_uninit(&gh);
- if (error == GLR_TRYFAILED) {
- yield();
- return AOP_TRUNCATED_PAGE;
- }
+ if (error && error != AOP_TRUNCATED_PAGE)
+ lock_page(page);
return error;
}
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 07/18] [GFS2] kernel panic mounting volume
2008-07-11 10:11 ` [Cluster-devel] [PATCH 06/18] [GFS2] Revise readpage locking swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 08/18] [GFS2] Remove remote lock dropping code swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Bob Peterson <rpeterso@redhat.com>
This patch fixes Red Hat bugzilla bug 450156.
This started with a not-too-improbable mount failure because the
locking protocol was never set back to its proper "lock_dlm" after the
system was rebooted in the middle of a gfs2_fsck. That left a
(purposely) invalid locking protocol in the superblock, which caused an
error when the file system was mounted the next time.
When there's an error mounting, vfs calls DQUOT_OFF, which calls
vfs_quota_off which calls gfs2_sync_fs. Next, gfs2_sync_fs calls
gfs2_log_flush passing s_fs_info. But due to the error, s_fs_info
had been previously set to NULL, and so we have the kernel oops.
My solution in this patch is to test for the NULL value before passing
it. I tested this patch and it fixes the problem.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/ops_super.c b/fs/gfs2/ops_super.c
index 0b7cc92..6690792 100644
--- a/fs/gfs2/ops_super.c
+++ b/fs/gfs2/ops_super.c
@@ -155,7 +155,7 @@ static void gfs2_write_super(struct super_block *sb)
static int gfs2_sync_fs(struct super_block *sb, int wait)
{
sb->s_dirt = 0;
- if (wait)
+ if (wait && sb->s_fs_info)
gfs2_log_flush(sb->s_fs_info, NULL);
return 0;
}
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 08/18] [GFS2] Remove remote lock dropping code
2008-07-11 10:11 ` [Cluster-devel] [PATCH 07/18] [GFS2] kernel panic mounting volume swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 09/18] [GFS2] Remove obsolete conversion deadlock avoidance code swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
There are several reasons why this is undesirable:
1. It never happens during normal operation anyway
2. If it does happen it causes performance to be very, very poor
3. It isn't likely to solve the original problem (memory shortage
on remote DLM node) it was supposed to solve
4. It uses a bunch of arbitrary constants which are unlikely to be
correct for any particular situation and for which the tuning seems
to be a black art.
5. In an N node cluster, only 1/N of the dropped locked will actually
contribute to solving the problem on average.
So all in all we are better off without it. This also makes merging
the lock_dlm module into GFS2 a bit easier.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/gfs2.h b/fs/gfs2/gfs2.h
index 3bb11c0..ef606e3 100644
--- a/fs/gfs2/gfs2.h
+++ b/fs/gfs2/gfs2.h
@@ -16,11 +16,6 @@ enum {
};
enum {
- NO_WAIT = 0,
- WAIT = 1,
-};
-
-enum {
NO_FORCE = 0,
FORCE = 1,
};
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index be7ed50..8d5450f 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1316,11 +1316,6 @@ void gfs2_glock_cb(void *cb_data, unsigned int type, void *data)
wake_up_process(sdp->sd_recoverd_process);
return;
- case LM_CB_DROPLOCKS:
- gfs2_gl_hash_clear(sdp, NO_WAIT);
- gfs2_quota_scan(sdp);
- return;
-
default:
gfs2_assert_warn(sdp, 0);
return;
@@ -1508,11 +1503,10 @@ static void clear_glock(struct gfs2_glock *gl)
* @sdp: the filesystem
* @wait: wait until it's all gone
*
- * Called when unmounting the filesystem, or when inter-node lock manager
- * requests DROPLOCKS because it is running out of capacity.
+ * Called when unmounting the filesystem.
*/
-void gfs2_gl_hash_clear(struct gfs2_sbd *sdp, int wait)
+void gfs2_gl_hash_clear(struct gfs2_sbd *sdp)
{
unsigned long t;
unsigned int x;
@@ -1527,7 +1521,7 @@ void gfs2_gl_hash_clear(struct gfs2_sbd *sdp, int wait)
cont = 1;
}
- if (!wait || !cont)
+ if (!cont)
break;
if (time_after_eq(jiffies,
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 7389f8e..971d92a 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -132,7 +132,7 @@ void gfs2_lvb_unhold(struct gfs2_glock *gl);
void gfs2_glock_cb(void *cb_data, unsigned int type, void *data);
void gfs2_glock_schedule_for_reclaim(struct gfs2_glock *gl);
void gfs2_reclaim_glock(struct gfs2_sbd *sdp);
-void gfs2_gl_hash_clear(struct gfs2_sbd *sdp, int wait);
+void gfs2_gl_hash_clear(struct gfs2_sbd *sdp);
int __init gfs2_glock_init(void);
void gfs2_glock_exit(void);
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index ad944c6..845a27f 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -79,9 +79,6 @@ struct gdlm_ls {
wait_queue_head_t wait_control;
struct task_struct *thread;
wait_queue_head_t thread_wait;
- unsigned long drop_time;
- int drop_locks_count;
- int drop_locks_period;
};
enum {
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index 0628520..fa31c54 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -22,8 +22,6 @@ static struct gdlm_ls *init_gdlm(lm_callback_t cb, struct gfs2_sbd *sdp,
if (!ls)
return NULL;
- ls->drop_locks_count = GDLM_DROP_COUNT;
- ls->drop_locks_period = GDLM_DROP_PERIOD;
ls->fscb = cb;
ls->sdp = sdp;
ls->fsflags = flags;
@@ -33,7 +31,6 @@ static struct gdlm_ls *init_gdlm(lm_callback_t cb, struct gfs2_sbd *sdp,
INIT_LIST_HEAD(&ls->all_locks);
init_waitqueue_head(&ls->thread_wait);
init_waitqueue_head(&ls->wait_control);
- ls->drop_time = jiffies;
ls->jid = -1;
strncpy(buf, table_name, 256);
diff --git a/fs/gfs2/locking/dlm/sysfs.c b/fs/gfs2/locking/dlm/sysfs.c
index a4ff271..4ec571c 100644
--- a/fs/gfs2/locking/dlm/sysfs.c
+++ b/fs/gfs2/locking/dlm/sysfs.c
@@ -114,17 +114,6 @@ static ssize_t recover_status_show(struct gdlm_ls *ls, char *buf)
return sprintf(buf, "%d\n", ls->recover_jid_status);
}
-static ssize_t drop_count_show(struct gdlm_ls *ls, char *buf)
-{
- return sprintf(buf, "%d\n", ls->drop_locks_count);
-}
-
-static ssize_t drop_count_store(struct gdlm_ls *ls, const char *buf, size_t len)
-{
- ls->drop_locks_count = simple_strtol(buf, NULL, 0);
- return len;
-}
-
struct gdlm_attr {
struct attribute attr;
ssize_t (*show)(struct gdlm_ls *, char *);
@@ -144,7 +133,6 @@ GDLM_ATTR(first_done, 0444, first_done_show, NULL);
GDLM_ATTR(recover, 0644, recover_show, recover_store);
GDLM_ATTR(recover_done, 0444, recover_done_show, NULL);
GDLM_ATTR(recover_status, 0444, recover_status_show, NULL);
-GDLM_ATTR(drop_count, 0644, drop_count_show, drop_count_store);
static struct attribute *gdlm_attrs[] = {
&gdlm_attr_proto_name.attr,
@@ -157,7 +145,6 @@ static struct attribute *gdlm_attrs[] = {
&gdlm_attr_recover.attr,
&gdlm_attr_recover_done.attr,
&gdlm_attr_recover_status.attr,
- &gdlm_attr_drop_count.attr,
NULL,
};
diff --git a/fs/gfs2/locking/dlm/thread.c b/fs/gfs2/locking/dlm/thread.c
index f30350a..38823ef 100644
--- a/fs/gfs2/locking/dlm/thread.c
+++ b/fs/gfs2/locking/dlm/thread.c
@@ -20,19 +20,6 @@ static inline int no_work(struct gdlm_ls *ls)
return ret;
}
-static inline int check_drop(struct gdlm_ls *ls)
-{
- if (!ls->drop_locks_count)
- return 0;
-
- if (time_after(jiffies, ls->drop_time + ls->drop_locks_period * HZ)) {
- ls->drop_time = jiffies;
- if (ls->all_locks_count >= ls->drop_locks_count)
- return 1;
- }
- return 0;
-}
-
static int gdlm_thread(void *data)
{
struct gdlm_ls *ls = (struct gdlm_ls *) data;
@@ -52,12 +39,6 @@ static int gdlm_thread(void *data)
gdlm_do_lock(lp);
spin_lock(&ls->async_lock);
}
- /* Does this ever happen these days? I hope not anyway */
- if (check_drop(ls)) {
- spin_unlock(&ls->async_lock);
- ls->fscb(ls->sdp, LM_CB_DROPLOCKS, NULL);
- spin_lock(&ls->async_lock);
- }
spin_unlock(&ls->async_lock);
}
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 9bd97c5..6ba69dd 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -874,7 +874,7 @@ fail_sb:
fail_locking:
init_locking(sdp, &mount_gh, UNDO);
fail_lm:
- gfs2_gl_hash_clear(sdp, WAIT);
+ gfs2_gl_hash_clear(sdp);
gfs2_lm_unmount(sdp);
while (invalidate_inodes(sb))
yield();
diff --git a/fs/gfs2/ops_super.c b/fs/gfs2/ops_super.c
index 6690792..f66ea0f 100644
--- a/fs/gfs2/ops_super.c
+++ b/fs/gfs2/ops_super.c
@@ -126,7 +126,7 @@ static void gfs2_put_super(struct super_block *sb)
gfs2_clear_rgrpd(sdp);
gfs2_jindex_free(sdp);
/* Take apart glock structures and buffer lists */
- gfs2_gl_hash_clear(sdp, WAIT);
+ gfs2_gl_hash_clear(sdp);
/* Unmount the locking protocol */
gfs2_lm_unmount(sdp);
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 9ab9fc8..6f7e2e5 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -110,18 +110,6 @@ static ssize_t statfs_sync_store(struct gfs2_sbd *sdp, const char *buf,
return len;
}
-static ssize_t shrink_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
-{
- if (!capable(CAP_SYS_ADMIN))
- return -EACCES;
-
- if (simple_strtol(buf, NULL, 0) != 1)
- return -EINVAL;
-
- gfs2_gl_hash_clear(sdp, NO_WAIT);
- return len;
-}
-
static ssize_t quota_sync_store(struct gfs2_sbd *sdp, const char *buf,
size_t len)
{
@@ -175,7 +163,6 @@ static struct gfs2_attr gfs2_attr_##name = __ATTR(name, mode, show, store)
GFS2_ATTR(id, 0444, id_show, NULL);
GFS2_ATTR(fsname, 0444, fsname_show, NULL);
GFS2_ATTR(freeze, 0644, freeze_show, freeze_store);
-GFS2_ATTR(shrink, 0200, NULL, shrink_store);
GFS2_ATTR(withdraw, 0644, withdraw_show, withdraw_store);
GFS2_ATTR(statfs_sync, 0200, NULL, statfs_sync_store);
GFS2_ATTR(quota_sync, 0200, NULL, quota_sync_store);
@@ -186,7 +173,6 @@ static struct attribute *gfs2_attrs[] = {
&gfs2_attr_id.attr,
&gfs2_attr_fsname.attr,
&gfs2_attr_freeze.attr,
- &gfs2_attr_shrink.attr,
&gfs2_attr_withdraw.attr,
&gfs2_attr_statfs_sync.attr,
&gfs2_attr_quota_sync.attr,
diff --git a/include/linux/lm_interface.h b/include/linux/lm_interface.h
index f274997..d0a7112 100644
--- a/include/linux/lm_interface.h
+++ b/include/linux/lm_interface.h
@@ -138,9 +138,6 @@ typedef void (*lm_callback_t) (void *ptr, unsigned int type, void *data);
* LM_CB_NEED_RECOVERY
* The given journal needs to be recovered.
*
- * LM_CB_DROPLOCKS
- * Reduce the number of cached locks.
- *
* LM_CB_ASYNC
* The given lock has been granted.
*/
@@ -149,7 +146,6 @@ typedef void (*lm_callback_t) (void *ptr, unsigned int type, void *data);
#define LM_CB_NEED_D 258
#define LM_CB_NEED_S 259
#define LM_CB_NEED_RECOVERY 260
-#define LM_CB_DROPLOCKS 261
#define LM_CB_ASYNC 262
/*
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 09/18] [GFS2] Remove obsolete conversion deadlock avoidance code
2008-07-11 10:11 ` [Cluster-devel] [PATCH 08/18] [GFS2] Remove remote lock dropping code swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 10/18] [GFS2] Remove all_list from lock_dlm swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
This is only used by GFS1 so can be removed.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/locking/dlm/lock.c b/fs/gfs2/locking/dlm/lock.c
index 871ffc9..894df45 100644
--- a/fs/gfs2/locking/dlm/lock.c
+++ b/fs/gfs2/locking/dlm/lock.c
@@ -80,7 +80,6 @@ static void process_complete(struct gdlm_lock *lp)
{
struct gdlm_ls *ls = lp->ls;
struct lm_async_cb acb;
- s16 prev_mode = lp->cur;
memset(&acb, 0, sizeof(acb));
@@ -160,15 +159,7 @@ static void process_complete(struct gdlm_lock *lp)
lp->lksb.sb_status, lp->lockname.ln_type,
(unsigned long long)lp->lockname.ln_number,
lp->flags);
- if (lp->lksb.sb_status == -EDEADLOCK &&
- lp->ls->fsflags & LM_MFLAG_CONV_NODROP) {
- lp->req = lp->cur;
- acb.lc_ret |= LM_OUT_CONV_DEADLK;
- if (lp->cur == DLM_LOCK_IV)
- lp->lksb.sb_lkid = 0;
- goto out;
- } else
- return;
+ return;
}
/*
@@ -268,10 +259,6 @@ out:
acb.lc_name = lp->lockname;
acb.lc_ret |= gdlm_make_lmstate(lp->cur);
- if (!test_and_clear_bit(LFL_NOCACHE, &lp->flags) &&
- (lp->cur > DLM_LOCK_NL) && (prev_mode > DLM_LOCK_NL))
- acb.lc_ret |= LM_OUT_CACHEABLE;
-
ls->fscb(ls->sdp, LM_CB_ASYNC, &acb);
}
@@ -376,14 +363,6 @@ static inline unsigned int make_flags(struct gdlm_lock *lp,
if (lp->lksb.sb_lkid != 0) {
lkf |= DLM_LKF_CONVERT;
-
- /* Conversion deadlock avoidance by DLM */
-
- if (!(lp->ls->fsflags & LM_MFLAG_CONV_NODROP) &&
- !test_bit(LFL_FORCE_PROMOTE, &lp->flags) &&
- !(lkf & DLM_LKF_NOQUEUE) &&
- cur > DLM_LOCK_NL && req > DLM_LOCK_NL && cur != req)
- lkf |= DLM_LKF_CONVDEADLK;
}
if (lp->lvb)
diff --git a/include/linux/lm_interface.h b/include/linux/lm_interface.h
index d0a7112..2ed8fa1 100644
--- a/include/linux/lm_interface.h
+++ b/include/linux/lm_interface.h
@@ -122,11 +122,9 @@ typedef void (*lm_callback_t) (void *ptr, unsigned int type, void *data);
*/
#define LM_OUT_ST_MASK 0x00000003
-#define LM_OUT_CACHEABLE 0x00000004
#define LM_OUT_CANCELED 0x00000008
#define LM_OUT_ASYNC 0x00000080
#define LM_OUT_ERROR 0x00000100
-#define LM_OUT_CONV_DEADLK 0x00000200
/*
* lm_callback_t types
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 10/18] [GFS2] Remove all_list from lock_dlm
2008-07-11 10:11 ` [Cluster-devel] [PATCH 09/18] [GFS2] Remove obsolete conversion deadlock avoidance code swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 11/18] [GFS2] Glock documentation swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
I discovered that we had a list onto which every lock_dlm
lock was being put. Its only function was to discover whether
we'd got any locks left after umount. Since there was already
a counter for that purpose as well, I removed the list. The
saving is sizeof(struct list_head) per glock - well worth
having.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/locking/dlm/lock.c b/fs/gfs2/locking/dlm/lock.c
index 894df45..2482c90 100644
--- a/fs/gfs2/locking/dlm/lock.c
+++ b/fs/gfs2/locking/dlm/lock.c
@@ -58,9 +58,6 @@ static void gdlm_delete_lp(struct gdlm_lock *lp)
spin_lock(&ls->async_lock);
if (!list_empty(&lp->delay_list))
list_del_init(&lp->delay_list);
- gdlm_assert(!list_empty(&lp->all_list), "%x,%llx", lp->lockname.ln_type,
- (unsigned long long)lp->lockname.ln_number);
- list_del_init(&lp->all_list);
ls->all_locks_count--;
spin_unlock(&ls->async_lock);
@@ -397,7 +394,6 @@ static int gdlm_create_lp(struct gdlm_ls *ls, struct lm_lockname *name,
INIT_LIST_HEAD(&lp->delay_list);
spin_lock(&ls->async_lock);
- list_add(&lp->all_list, &ls->all_locks);
ls->all_locks_count++;
spin_unlock(&ls->async_lock);
@@ -710,22 +706,3 @@ void gdlm_submit_delayed(struct gdlm_ls *ls)
wake_up(&ls->thread_wait);
}
-int gdlm_release_all_locks(struct gdlm_ls *ls)
-{
- struct gdlm_lock *lp, *safe;
- int count = 0;
-
- spin_lock(&ls->async_lock);
- list_for_each_entry_safe(lp, safe, &ls->all_locks, all_list) {
- list_del_init(&lp->all_list);
-
- if (lp->lvb && lp->lvb != junk_lvb)
- kfree(lp->lvb);
- kfree(lp);
- count++;
- }
- spin_unlock(&ls->async_lock);
-
- return count;
-}
-
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index 845a27f..21cf466 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -74,7 +74,6 @@ struct gdlm_ls {
spinlock_t async_lock;
struct list_head delayed;
struct list_head submit;
- struct list_head all_locks;
u32 all_locks_count;
wait_queue_head_t wait_control;
struct task_struct *thread;
@@ -112,7 +111,6 @@ struct gdlm_lock {
unsigned long flags; /* lock_dlm flags LFL_ */
struct list_head delay_list; /* delayed */
- struct list_head all_list; /* all locks for the fs */
struct gdlm_lock *hold_null; /* NL lock for hold_lvb */
};
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index fa31c54..947e706 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -28,7 +28,6 @@ static struct gdlm_ls *init_gdlm(lm_callback_t cb, struct gfs2_sbd *sdp,
spin_lock_init(&ls->async_lock);
INIT_LIST_HEAD(&ls->delayed);
INIT_LIST_HEAD(&ls->submit);
- INIT_LIST_HEAD(&ls->all_locks);
init_waitqueue_head(&ls->thread_wait);
init_waitqueue_head(&ls->wait_control);
ls->jid = -1;
@@ -173,7 +172,6 @@ out:
static void gdlm_unmount(void *lockspace)
{
struct gdlm_ls *ls = lockspace;
- int rv;
log_debug("unmount flags %lx", ls->flags);
@@ -187,9 +185,7 @@ static void gdlm_unmount(void *lockspace)
gdlm_kobject_release(ls);
dlm_release_lockspace(ls->dlm_lockspace, 2);
gdlm_release_threads(ls);
- rv = gdlm_release_all_locks(ls);
- if (rv)
- log_info("gdlm_unmount: %d stray locks freed", rv);
+ BUG_ON(ls->all_locks_count);
out:
kfree(ls);
}
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 11/18] [GFS2] Glock documentation
2008-07-11 10:11 ` [Cluster-devel] [PATCH 10/18] [GFS2] Remove all_list from lock_dlm swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 12/18] [GFS2] Fix module building swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
This patch adds a file describing the internals of GFS2's glock
abstraction.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/Documentation/filesystems/gfs2-glocks.txt b/Documentation/filesystems/gfs2-glocks.txt
new file mode 100644
index 0000000..4dae9a3
--- /dev/null
+++ b/Documentation/filesystems/gfs2-glocks.txt
@@ -0,0 +1,114 @@
+ Glock internal locking rules
+ ------------------------------
+
+This documents the basic principles of the glock state machine
+internals. Each glock (struct gfs2_glock in fs/gfs2/incore.h)
+has two main (internal) locks:
+
+ 1. A spinlock (gl_spin) which protects the internal state such
+ as gl_state, gl_target and the list of holders (gl_holders)
+ 2. A non-blocking bit lock, GLF_LOCK, which is used to prevent other
+ threads from making calls to the DLM, etc. at the same time. If a
+ thread takes this lock, it must then call run_queue (usually via the
+ workqueue) when it releases it in order to ensure any pending tasks
+ are completed.
+
+The gl_holders list contains all the queued lock requests (not
+just the holders) associated with the glock. If there are any
+held locks, then they will be contiguous entries at the head
+of the list. Locks are granted in strictly the order that they
+are queued, except for those marked LM_FLAG_PRIORITY which are
+used only during recovery, and even then only for journal locks.
+
+There are three lock states that users of the glock layer can request,
+namely shared (SH), deferred (DF) and exclusive (EX). Those translate
+to the following DLM lock modes:
+
+Glock mode | DLM lock mode
+------------------------------
+ UN | IV/NL Unlocked (no DLM lock associated with glock) or NL
+ SH | PR (Protected read)
+ DF | CW (Concurrent write)
+ EX | EX (Exclusive)
+
+Thus DF is basically a shared mode which is incompatible with the "normal"
+shared lock mode, SH. In GFS2 the DF mode is used exclusively for direct I/O
+operations. The glocks are basically a lock plus some routines which deal
+with cache management. The following rules apply for the cache:
+
+Glock mode | Cache data | Cache Metadata | Dirty Data | Dirty Metadata
+--------------------------------------------------------------------------
+ UN | No | No | No | No
+ SH | Yes | Yes | No | No
+ DF | No | Yes | No | No
+ EX | Yes | Yes | Yes | Yes
+
+These rules are implemented using the various glock operations which
+are defined for each type of glock. Not all types of glocks use
+all the modes. Only inode glocks use the DF mode for example.
+
+Table of glock operations and per type constants:
+
+Field | Purpose
+----------------------------------------------------------------------------
+go_xmote_th | Called before remote state change (e.g. to sync dirty data)
+go_xmote_bh | Called after remote state change (e.g. to refill cache)
+go_inval | Called if remote state change requires invalidating the cache
+go_demote_ok | Returns boolean value of whether its ok to demote a glock
+ | (e.g. checks timeout, and that there is no cached data)
+go_lock | Called for the first local holder of a lock
+go_unlock | Called on the final local unlock of a lock
+go_dump | Called to print content of object for debugfs file, or on
+ | error to dump glock to the log.
+go_type; | The type of the glock, LM_TYPE_.....
+go_min_hold_time | The minimum hold time
+
+The minimum hold time for each lock is the time after a remote lock
+grant for which we ignore remote demote requests. This is in order to
+prevent a situation where locks are being bounced around the cluster
+from node to node with none of the nodes making any progress. This
+tends to show up most with shared mmaped files which are being written
+to by multiple nodes. By delaying the demotion in response to a
+remote callback, that gives the userspace program time to make
+some progress before the pages are unmapped.
+
+There is a plan to try and remove the go_lock and go_unlock callbacks
+if possible, in order to try and speed up the fast path though the locking.
+Also, eventually we hope to make the glock "EX" mode locally shared
+such that any local locking will be done with the i_mutex as required
+rather than via the glock.
+
+Locking rules for glock operations:
+
+Operation | GLF_LOCK bit lock held | gl_spin spinlock held
+-----------------------------------------------------------------
+go_xmote_th | Yes | No
+go_xmote_bh | Yes | No
+go_inval | Yes | No
+go_demote_ok | Sometimes | Yes
+go_lock | Yes | No
+go_unlock | Yes | No
+go_dump | Sometimes | Yes
+
+N.B. Operations must not drop either the bit lock or the spinlock
+if its held on entry. go_dump and do_demote_ok must never block.
+Note that go_dump will only be called if the glock's state
+indicates that it is caching uptodate data.
+
+Glock locking order within GFS2:
+
+ 1. i_mutex (if required)
+ 2. Rename glock (for rename only)
+ 3. Inode glock(s)
+ (Parents before children, inodes at "same level" with same parent in
+ lock number order)
+ 4. Rgrp glock(s) (for (de)allocation operations)
+ 5. Transaction glock (via gfs2_trans_begin) for non-read operations
+ 6. Page lock (always last, very important!)
+
+There are two glocks per inode. One deals with access to the inode
+itself (locking order as above), and the other, known as the iopen
+glock is used in conjunction with the i_nlink field in the inode to
+determine the lifetime of the inode in question. Locking of inodes
+is on a per-inode basis. Locking of rgrps is on a per rgrp basis.
+
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 12/18] [GFS2] Fix module building
2008-07-11 10:11 ` [Cluster-devel] [PATCH 11/18] [GFS2] Glock documentation swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 13/18] [GFS2] don't call permission() swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
Two lines missed from the previous patch.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index 21cf466..3c98e7c 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -148,7 +148,6 @@ void gdlm_release_threads(struct gdlm_ls *);
/* lock.c */
void gdlm_submit_delayed(struct gdlm_ls *);
-int gdlm_release_all_locks(struct gdlm_ls *);
unsigned int gdlm_do_lock(struct gdlm_lock *);
int gdlm_get_lock(void *, struct lm_lockname *, void **);
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index 947e706..09d78c2 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -221,7 +221,6 @@ static void gdlm_withdraw(void *lockspace)
dlm_release_lockspace(ls->dlm_lockspace, 2);
gdlm_release_threads(ls);
- gdlm_release_all_locks(ls);
gdlm_kobject_release(ls);
}
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 13/18] [GFS2] don't call permission()
2008-07-11 10:11 ` [Cluster-devel] [PATCH 12/18] [GFS2] Fix module building swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 14/18] [GFS2] Fix delayed demote race swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Miklos Szeredi <miklos@szeredi.hu>
GFS2 calls permission() to verify permissions after locks on the files
have been taken.
For this it's sufficient to call gfs2_permission() instead. This
results in the following changes:
- IS_RDONLY() check is not performed
- IS_IMMUTABLE() check is not performed
- devcgroup_inode_permission() is not called
- security_inode_permission() is not called
IS_RDONLY() should be unnecessary anyway, as the per-mount read-only
flag should provide protection against read-only remounts during
operations. do_gfs2_set_flags() has been fixed to perform
mnt_want_write()/mnt_drop_write() to protect against remounting
read-only.
IS_IMMUTABLE has been added to gfs2_permission()
Repeating the security checks seems to be pointless, as they don't
normally change, and if they do, it's independent of the filesystem
state.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 09453d0..caf4090 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -504,7 +504,7 @@ struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
}
if (!is_root) {
- error = permission(dir, MAY_EXEC, NULL);
+ error = gfs2_permission(dir, MAY_EXEC);
if (error)
goto out;
}
@@ -667,7 +667,7 @@ static int create_ok(struct gfs2_inode *dip, const struct qstr *name,
{
int error;
- error = permission(&dip->i_inode, MAY_WRITE | MAY_EXEC, NULL);
+ error = gfs2_permission(&dip->i_inode, MAY_WRITE | MAY_EXEC);
if (error)
return error;
@@ -1134,7 +1134,7 @@ int gfs2_unlink_ok(struct gfs2_inode *dip, const struct qstr *name,
if (IS_APPEND(&dip->i_inode))
return -EPERM;
- error = permission(&dip->i_inode, MAY_WRITE | MAY_EXEC, NULL);
+ error = gfs2_permission(&dip->i_inode, MAY_WRITE | MAY_EXEC);
if (error)
return error;
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 580da45..04e9fef 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -91,6 +91,7 @@ int gfs2_rmdiri(struct gfs2_inode *dip, const struct qstr *name,
struct gfs2_inode *ip);
int gfs2_unlink_ok(struct gfs2_inode *dip, const struct qstr *name,
const struct gfs2_inode *ip);
+int gfs2_permission(struct inode *inode, int mask);
int gfs2_ok_to_move(struct gfs2_inode *this, struct gfs2_inode *to);
int gfs2_readlinki(struct gfs2_inode *ip, char **buf, unsigned int *len);
int gfs2_glock_nq_atime(struct gfs2_holder *gh);
diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
index 0ff512a..1737af9 100644
--- a/fs/gfs2/ops_file.c
+++ b/fs/gfs2/ops_file.c
@@ -15,6 +15,7 @@
#include <linux/uio.h>
#include <linux/blkdev.h>
#include <linux/mm.h>
+#include <linux/mount.h>
#include <linux/fs.h>
#include <linux/gfs2_ondisk.h>
#include <linux/ext2_fs.h>
@@ -220,10 +221,14 @@ static int do_gfs2_set_flags(struct file *filp, u32 reqflags, u32 mask)
int error;
u32 new_flags, flags;
- error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
+ error = mnt_want_write(filp->f_path.mnt);
if (error)
return error;
+ error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
+ if (error)
+ goto out_drop_write;
+
flags = ip->i_di.di_flags;
new_flags = (flags & ~mask) | (reqflags & mask);
if ((new_flags ^ flags) == 0)
@@ -242,7 +247,7 @@ static int do_gfs2_set_flags(struct file *filp, u32 reqflags, u32 mask)
!capable(CAP_LINUX_IMMUTABLE))
goto out;
if (!IS_IMMUTABLE(inode)) {
- error = permission(inode, MAY_WRITE, NULL);
+ error = gfs2_permission(inode, MAY_WRITE);
if (error)
goto out;
}
@@ -272,6 +277,8 @@ out_trans_end:
gfs2_trans_end(sdp);
out:
gfs2_glock_dq_uninit(&gh);
+out_drop_write:
+ mnt_drop_write(filp->f_path.mnt);
return error;
}
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 2686ad4..1e252df 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -163,7 +163,7 @@ static int gfs2_link(struct dentry *old_dentry, struct inode *dir,
if (error)
goto out;
- error = permission(dir, MAY_WRITE | MAY_EXEC, NULL);
+ error = gfs2_permission(dir, MAY_WRITE | MAY_EXEC);
if (error)
goto out_gunlock;
@@ -669,7 +669,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
}
}
} else {
- error = permission(ndir, MAY_WRITE | MAY_EXEC, NULL);
+ error = gfs2_permission(ndir, MAY_WRITE | MAY_EXEC);
if (error)
goto out_gunlock;
@@ -704,7 +704,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
/* Check out the dir to be renamed */
if (dir_rename) {
- error = permission(odentry->d_inode, MAY_WRITE, NULL);
+ error = gfs2_permission(odentry->d_inode, MAY_WRITE);
if (error)
goto out_gunlock;
}
@@ -891,7 +891,7 @@ static void *gfs2_follow_link(struct dentry *dentry, struct nameidata *nd)
* Returns: errno
*/
-static int gfs2_permission(struct inode *inode, int mask, struct nameidata *nd)
+int gfs2_permission(struct inode *inode, int mask)
{
struct gfs2_inode *ip = GFS2_I(inode);
struct gfs2_holder i_gh;
@@ -905,13 +905,22 @@ static int gfs2_permission(struct inode *inode, int mask, struct nameidata *nd)
unlock = 1;
}
- error = generic_permission(inode, mask, gfs2_check_acl);
+ if ((mask & MAY_WRITE) && IS_IMMUTABLE(inode))
+ error = -EACCES;
+ else
+ error = generic_permission(inode, mask, gfs2_check_acl);
if (unlock)
gfs2_glock_dq_uninit(&i_gh);
return error;
}
+static int gfs2_iop_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ return gfs2_permission(inode, mask);
+}
+
static int setattr_size(struct inode *inode, struct iattr *attr)
{
struct gfs2_inode *ip = GFS2_I(inode);
@@ -1141,7 +1150,7 @@ static int gfs2_removexattr(struct dentry *dentry, const char *name)
}
const struct inode_operations gfs2_file_iops = {
- .permission = gfs2_permission,
+ .permission = gfs2_iop_permission,
.setattr = gfs2_setattr,
.getattr = gfs2_getattr,
.setxattr = gfs2_setxattr,
@@ -1160,7 +1169,7 @@ const struct inode_operations gfs2_dir_iops = {
.rmdir = gfs2_rmdir,
.mknod = gfs2_mknod,
.rename = gfs2_rename,
- .permission = gfs2_permission,
+ .permission = gfs2_iop_permission,
.setattr = gfs2_setattr,
.getattr = gfs2_getattr,
.setxattr = gfs2_setxattr,
@@ -1172,7 +1181,7 @@ const struct inode_operations gfs2_dir_iops = {
const struct inode_operations gfs2_symlink_iops = {
.readlink = gfs2_readlink,
.follow_link = gfs2_follow_link,
- .permission = gfs2_permission,
+ .permission = gfs2_iop_permission,
.setattr = gfs2_setattr,
.getattr = gfs2_getattr,
.setxattr = gfs2_setxattr,
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 14/18] [GFS2] Fix delayed demote race
2008-07-11 10:11 ` [Cluster-devel] [PATCH 13/18] [GFS2] don't call permission() swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 15/18] [GFS2] Allow local DF locks when holding a cached EX glock swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
There is a race in the delayed demote code where it does the wrong thing
if a demotion to UN has occurred for other reasons before the delay has
expired. This patch adds an assert to catch that condition as well as
fixing the root cause by adding an additional check for the UN state.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 8d5450f..cd0aa21 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -587,6 +587,7 @@ static void run_queue(struct gfs2_glock *gl, const int nonblock)
if (nonblock)
goto out_sched;
set_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
+ GLOCK_BUG_ON(gl, gl->gl_demote_state == LM_ST_EXCLUSIVE);
gl->gl_target = gl->gl_demote_state;
} else {
if (test_bit(GLF_DEMOTE, &gl->gl_flags))
@@ -617,7 +618,9 @@ static void glock_work_func(struct work_struct *work)
if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags))
finish_xmote(gl, gl->gl_reply);
spin_lock(&gl->gl_spin);
- if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags)) {
+ if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
+ gl->gl_state != LM_ST_UNLOCKED &&
+ gl->gl_demote_state != LM_ST_EXCLUSIVE) {
unsigned long holdtime, now = jiffies;
holdtime = gl->gl_tchange + gl->gl_ops->go_min_hold_time;
if (time_before(now, holdtime))
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 15/18] [GFS2] Allow local DF locks when holding a cached EX glock
2008-07-11 10:11 ` [Cluster-devel] [PATCH 14/18] [GFS2] Fix delayed demote race swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 16/18] [GFS2] Replace rgrp "recent list" with mru list swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
We already allow local SH locks while we hold a cached EX glock, so here
we allow DF locks as well. This works only because we rely on the VFS's
invalidation for locally cached data, and because if we hold an EX lock,
then we know that no other node can be caching data relating to this
file.
It dramatically speeds up initial writes to O_DIRECT files since we fall
back to buffered I/O for this and would otherwise bounce between DF and
EX modes on each and every write call. The lessons to be learned from
that are to ensure that (for the time being anyway) O_DIRECT files are
preallocated and that they are written to using reasonably large I/O
sizes. Even so this change fixes that corner case nicely
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index cd0aa21..13391e5 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -267,8 +267,12 @@ static inline int may_grant(const struct gfs2_glock *gl, const struct gfs2_holde
return 1;
if (gh->gh_flags & GL_EXACT)
return 0;
- if (gh->gh_state == LM_ST_SHARED && gl->gl_state == LM_ST_EXCLUSIVE)
- return 1;
+ if (gl->gl_state == LM_ST_EXCLUSIVE) {
+ if (gh->gh_state == LM_ST_SHARED && gh_head->gh_state == LM_ST_SHARED)
+ return 1;
+ if (gh->gh_state == LM_ST_DEFERRED && gh_head->gh_state == LM_ST_DEFERRED)
+ return 1;
+ }
if (gl->gl_state != LM_ST_UNLOCKED && (gh->gh_flags & LM_FLAG_ANY))
return 1;
return 0;
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 16/18] [GFS2] Replace rgrp "recent list" with mru list
2008-07-11 10:11 ` [Cluster-devel] [PATCH 15/18] [GFS2] Allow local DF locks when holding a cached EX glock swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 17/18] [GFS2] Remove support for unused and pointless flag swhiteho
0 siblings, 1 reply; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
This patch removes the "recent list" which is used during allocation
and replaces it with the (already existing) mru list used during
deletion. The "recent list" was not a true mru list leading to a number
of inefficiencies including a "next" function which made scanning the
list an order N^2 operation wrt to the number of list elements.
This should increase allocation performance with large numbers of rgrps.
Its also a useful preparation and cleanup before some further changes
which are planned in this area.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 4b734c6..4ab3c3a 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -77,7 +77,6 @@ struct gfs2_rgrp_host {
struct gfs2_rgrpd {
struct list_head rd_list; /* Link with superblock */
struct list_head rd_list_mru;
- struct list_head rd_recent; /* Recently used rgrps */
struct gfs2_glock *rd_gl; /* Glock for this rgrp */
u64 rd_addr; /* grp block disk address */
u64 rd_data0; /* first data location */
@@ -529,7 +528,6 @@ struct gfs2_sbd {
struct mutex sd_rindex_mutex;
struct list_head sd_rindex_list;
struct list_head sd_rindex_mru_list;
- struct list_head sd_rindex_recent_list;
struct gfs2_rgrpd *sd_rindex_forward;
unsigned int sd_rgrps;
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 6ba69dd..b4d1d64 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -64,7 +64,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
mutex_init(&sdp->sd_rindex_mutex);
INIT_LIST_HEAD(&sdp->sd_rindex_list);
INIT_LIST_HEAD(&sdp->sd_rindex_mru_list);
- INIT_LIST_HEAD(&sdp->sd_rindex_recent_list);
INIT_LIST_HEAD(&sdp->sd_jindex_list);
spin_lock_init(&sdp->sd_jindex_spin);
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 3401628..2d90fb2 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -371,11 +371,6 @@ static void clear_rgrpdi(struct gfs2_sbd *sdp)
spin_lock(&sdp->sd_rindex_spin);
sdp->sd_rindex_forward = NULL;
- head = &sdp->sd_rindex_recent_list;
- while (!list_empty(head)) {
- rgd = list_entry(head->next, struct gfs2_rgrpd, rd_recent);
- list_del(&rgd->rd_recent);
- }
spin_unlock(&sdp->sd_rindex_spin);
head = &sdp->sd_rindex_list;
@@ -945,107 +940,30 @@ static struct inode *try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked)
}
/**
- * recent_rgrp_first - get first RG from "recent" list
- * @sdp: The GFS2 superblock
- * @rglast: address of the rgrp used last
- *
- * Returns: The first rgrp in the recent list
- */
-
-static struct gfs2_rgrpd *recent_rgrp_first(struct gfs2_sbd *sdp,
- u64 rglast)
-{
- struct gfs2_rgrpd *rgd;
-
- spin_lock(&sdp->sd_rindex_spin);
-
- if (rglast) {
- list_for_each_entry(rgd, &sdp->sd_rindex_recent_list, rd_recent) {
- if (rgrp_contains_block(rgd, rglast))
- goto out;
- }
- }
- rgd = NULL;
- if (!list_empty(&sdp->sd_rindex_recent_list))
- rgd = list_entry(sdp->sd_rindex_recent_list.next,
- struct gfs2_rgrpd, rd_recent);
-out:
- spin_unlock(&sdp->sd_rindex_spin);
- return rgd;
-}
-
-/**
* recent_rgrp_next - get next RG from "recent" list
* @cur_rgd: current rgrp
- * @remove:
*
* Returns: The next rgrp in the recent list
*/
-static struct gfs2_rgrpd *recent_rgrp_next(struct gfs2_rgrpd *cur_rgd,
- int remove)
+static struct gfs2_rgrpd *recent_rgrp_next(struct gfs2_rgrpd *cur_rgd)
{
struct gfs2_sbd *sdp = cur_rgd->rd_sbd;
struct list_head *head;
struct gfs2_rgrpd *rgd;
spin_lock(&sdp->sd_rindex_spin);
-
- head = &sdp->sd_rindex_recent_list;
-
- list_for_each_entry(rgd, head, rd_recent) {
- if (rgd == cur_rgd) {
- if (cur_rgd->rd_recent.next != head)
- rgd = list_entry(cur_rgd->rd_recent.next,
- struct gfs2_rgrpd, rd_recent);
- else
- rgd = NULL;
-
- if (remove)
- list_del(&cur_rgd->rd_recent);
-
- goto out;
- }
+ head = &sdp->sd_rindex_mru_list;
+ if (unlikely(cur_rgd->rd_list_mru.next == head)) {
+ spin_unlock(&sdp->sd_rindex_spin);
+ return NULL;
}
-
- rgd = NULL;
- if (!list_empty(head))
- rgd = list_entry(head->next, struct gfs2_rgrpd, rd_recent);
-
-out:
+ rgd = list_entry(cur_rgd->rd_list_mru.next, struct gfs2_rgrpd, rd_list_mru);
spin_unlock(&sdp->sd_rindex_spin);
return rgd;
}
/**
- * recent_rgrp_add - add an RG to tail of "recent" list
- * @new_rgd: The rgrp to add
- *
- */
-
-static void recent_rgrp_add(struct gfs2_rgrpd *new_rgd)
-{
- struct gfs2_sbd *sdp = new_rgd->rd_sbd;
- struct gfs2_rgrpd *rgd;
- unsigned int count = 0;
- unsigned int max = sdp->sd_rgrps / gfs2_jindex_size(sdp);
-
- spin_lock(&sdp->sd_rindex_spin);
-
- list_for_each_entry(rgd, &sdp->sd_rindex_recent_list, rd_recent) {
- if (rgd == new_rgd)
- goto out;
-
- if (++count >= max)
- goto out;
- }
- list_add_tail(&new_rgd->rd_recent, &sdp->sd_rindex_recent_list);
-
-out:
- spin_unlock(&sdp->sd_rindex_spin);
-}
-
-/**
* forward_rgrp_get - get an rgrp to try next from full list
* @sdp: The GFS2 superblock
*
@@ -1112,9 +1030,7 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
int loops = 0;
int error, rg_locked;
- /* Try recently successful rgrps */
-
- rgd = recent_rgrp_first(sdp, ip->i_goal);
+ rgd = gfs2_blk2rgrpd(sdp, ip->i_goal);
while (rgd) {
rg_locked = 0;
@@ -1136,11 +1052,9 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
gfs2_glock_dq_uninit(&al->al_rgd_gh);
if (inode)
return inode;
- rgd = recent_rgrp_next(rgd, 1);
- break;
-
+ /* fall through */
case GLR_TRYFAILED:
- rgd = recent_rgrp_next(rgd, 0);
+ rgd = recent_rgrp_next(rgd);
break;
default:
@@ -1199,7 +1113,9 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
out:
if (begin) {
- recent_rgrp_add(rgd);
+ spin_lock(&sdp->sd_rindex_spin);
+ list_move(&rgd->rd_list_mru, &sdp->sd_rindex_mru_list);
+ spin_unlock(&sdp->sd_rindex_spin);
rgd = gfs2_rgrpd_get_next(rgd);
if (!rgd)
rgd = gfs2_rgrpd_get_first(sdp);
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 17/18] [GFS2] Remove support for unused and pointless flag
2008-07-11 10:11 ` [Cluster-devel] [PATCH 16/18] [GFS2] Replace rgrp "recent list" with mru list swhiteho
@ 2008-07-11 10:11 ` swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 18/18] [GFS2] Remove unused declaration swhiteho
[not found] ` <Pine.LNX.4.61.0807110714110.18083@chaos.analogic.com>
0 siblings, 2 replies; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Steven Whitehouse <swhiteho@redhat.com>
The ability to mark files for direct i/o access when opened
normally is both unused and pointless, so this patch removes
support for that feature.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 4ab3c3a..448697a 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -421,7 +421,6 @@ struct gfs2_tune {
unsigned int gt_quota_quantum; /* Secs between syncs to quota file */
unsigned int gt_atime_quantum; /* Min secs between atime updates */
unsigned int gt_new_files_jdata;
- unsigned int gt_new_files_directio;
unsigned int gt_max_readahead; /* Max bytes to read-ahead from disk */
unsigned int gt_stall_secs; /* Detects trouble! */
unsigned int gt_complain_secs;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index caf4090..6da0ab3 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -789,13 +789,8 @@ static void init_dinode(struct gfs2_inode *dip, struct gfs2_glock *gl,
if ((dip->i_di.di_flags & GFS2_DIF_INHERIT_JDATA) ||
gfs2_tune_get(sdp, gt_new_files_jdata))
di->di_flags |= cpu_to_be32(GFS2_DIF_JDATA);
- if ((dip->i_di.di_flags & GFS2_DIF_INHERIT_DIRECTIO) ||
- gfs2_tune_get(sdp, gt_new_files_directio))
- di->di_flags |= cpu_to_be32(GFS2_DIF_DIRECTIO);
} else if (S_ISDIR(mode)) {
di->di_flags |= cpu_to_be32(dip->i_di.di_flags &
- GFS2_DIF_INHERIT_DIRECTIO);
- di->di_flags |= cpu_to_be32(dip->i_di.di_flags &
GFS2_DIF_INHERIT_JDATA);
}
diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
index 1737af9..21b397d 100644
--- a/fs/gfs2/ops_file.c
+++ b/fs/gfs2/ops_file.c
@@ -134,7 +134,6 @@ static const u32 fsflags_to_gfs2[32] = {
[7] = GFS2_DIF_NOATIME,
[12] = GFS2_DIF_EXHASH,
[14] = GFS2_DIF_INHERIT_JDATA,
- [20] = GFS2_DIF_INHERIT_DIRECTIO,
};
static const u32 gfs2_to_fsflags[32] = {
@@ -143,7 +142,6 @@ static const u32 gfs2_to_fsflags[32] = {
[gfs2fl_AppendOnly] = FS_APPEND_FL,
[gfs2fl_NoAtime] = FS_NOATIME_FL,
[gfs2fl_ExHash] = FS_INDEX_FL,
- [gfs2fl_InheritDirectio] = FS_DIRECTIO_FL,
[gfs2fl_InheritJdata] = FS_JOURNAL_DATA_FL,
};
@@ -161,12 +159,8 @@ static int gfs2_get_flags(struct file *filp, u32 __user *ptr)
return error;
fsflags = fsflags_cvt(gfs2_to_fsflags, ip->i_di.di_flags);
- if (!S_ISDIR(inode->i_mode)) {
- if (ip->i_di.di_flags & GFS2_DIF_JDATA)
- fsflags |= FS_JOURNAL_DATA_FL;
- if (ip->i_di.di_flags & GFS2_DIF_DIRECTIO)
- fsflags |= FS_DIRECTIO_FL;
- }
+ if (!S_ISDIR(inode->i_mode) && ip->i_di.di_flags & GFS2_DIF_JDATA)
+ fsflags |= FS_JOURNAL_DATA_FL;
if (put_user(fsflags, ptr))
error = -EFAULT;
@@ -195,13 +189,11 @@ void gfs2_set_inode_flags(struct inode *inode)
/* Flags that can be set by user space */
#define GFS2_FLAGS_USER_SET (GFS2_DIF_JDATA| \
- GFS2_DIF_DIRECTIO| \
GFS2_DIF_IMMUTABLE| \
GFS2_DIF_APPENDONLY| \
GFS2_DIF_NOATIME| \
GFS2_DIF_SYNC| \
GFS2_DIF_SYSTEM| \
- GFS2_DIF_INHERIT_DIRECTIO| \
GFS2_DIF_INHERIT_JDATA)
/**
@@ -292,8 +284,6 @@ static int gfs2_set_flags(struct file *filp, u32 __user *ptr)
if (!S_ISDIR(inode->i_mode)) {
if (gfsflags & GFS2_DIF_INHERIT_JDATA)
gfsflags ^= (GFS2_DIF_JDATA | GFS2_DIF_INHERIT_JDATA);
- if (gfsflags & GFS2_DIF_INHERIT_DIRECTIO)
- gfsflags ^= (GFS2_DIF_DIRECTIO | GFS2_DIF_INHERIT_DIRECTIO);
return do_gfs2_set_flags(filp, gfsflags, ~0);
}
return do_gfs2_set_flags(filp, gfsflags, ~GFS2_DIF_JDATA);
@@ -494,11 +484,6 @@ static int gfs2_open(struct inode *inode, struct file *file)
goto fail_gunlock;
}
- /* Listen to the Direct I/O flag */
-
- if (ip->i_di.di_flags & GFS2_DIF_DIRECTIO)
- file->f_flags |= O_DIRECT;
-
gfs2_glock_dq_uninit(&i_gh);
}
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 12fe38f..63a8a90 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -65,7 +65,6 @@ void gfs2_tune_init(struct gfs2_tune *gt)
gt->gt_quota_quantum = 60;
gt->gt_atime_quantum = 3600;
gt->gt_new_files_jdata = 0;
- gt->gt_new_files_directio = 0;
gt->gt_max_readahead = 1 << 18;
gt->gt_stall_secs = 600;
gt->gt_complain_secs = 10;
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 6f7e2e5..7484655 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -412,7 +412,6 @@ TUNE_ATTR(max_readahead, 0);
TUNE_ATTR(complain_secs, 0);
TUNE_ATTR(statfs_slow, 0);
TUNE_ATTR(new_files_jdata, 0);
-TUNE_ATTR(new_files_directio, 0);
TUNE_ATTR(quota_simul_sync, 1);
TUNE_ATTR(quota_cache_secs, 1);
TUNE_ATTR(stall_secs, 1);
@@ -441,7 +440,6 @@ static struct attribute *tune_attrs[] = {
&tune_attr_quotad_secs.attr,
&tune_attr_quota_scale.attr,
&tune_attr_new_files_jdata.attr,
- &tune_attr_new_files_directio.attr,
NULL,
};
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] [PATCH 18/18] [GFS2] Remove unused declaration
2008-07-11 10:11 ` [Cluster-devel] [PATCH 17/18] [GFS2] Remove support for unused and pointless flag swhiteho
@ 2008-07-11 10:11 ` swhiteho
[not found] ` <Pine.LNX.4.61.0807110714110.18083@chaos.analogic.com>
1 sibling, 0 replies; 35+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Li Xiaodong <lixd@cn.fujitsu.com>
The implementation of gfs2_inode_attr_in is removed.
So remove its declaration.
Signed-off-by: Li Xiaodong <lixd@cn.fujitsu.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 04e9fef..6074c25 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -72,7 +72,6 @@ static inline void gfs2_inum_out(const struct gfs2_inode *ip,
}
-void gfs2_inode_attr_in(struct gfs2_inode *ip);
void gfs2_set_iop(struct inode *inode);
struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
u64 no_addr, u64 no_formal_ino,
--
1.5.1.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Cluster-devel] Re: [PATCH 17/18] [GFS2] Remove support for unused and pointless flag
[not found] ` <Pine.LNX.4.61.0807110714110.18083@chaos.analogic.com>
@ 2008-07-11 11:52 ` Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2008-07-11 11:52 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Fri, 2008-07-11 at 07:19 -0400, linux-os (Dick Johnson) wrote:
> On Fri, 11 Jul 2008 swhiteho at redhat.com wrote:
>
> > From: Steven Whitehouse <swhiteho@redhat.com>
> >
> > The ability to mark files for direct i/o access when opened
> > normally is both unused and pointless, so this patch removes
> > support for that feature.
> [Snipped...]
>
> So linux is no longer going to support commercial databases?
> Oracle and others need the O_DIRECT attribute. Linux can
> probably igonore it, but such an open cannot fail or else
> Linux gets thrown out of the commercial enterprise when
> an upgrade disables an entire financial institution.
>
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.6.22.1 on an i686 machine (5588.28 BogoMips).
> My book : http://www.AbominableFirebug.com/
> _
>
No, thats not what it means...., but perhaps I could have explained it
better. You can still use O_DIRECT flags in open syscalls in exactly the
same way as before. What this patch removes is the (never used, and IMHO
pointless) feature of being able to set a flag on the inode (via
chattr/setattr) to force open's to set O_DIRECT in every case, even when
the application didn't ask for it. The reason that its pointless is that
applications which don't otherwise support O_DIRECT are very unlikely to
supply suitably aligned buffers and I/O requests, so that overriding the
O_DIRECT flag for such applications will most likely result in an I/O
error.
GFS2 will still support O_DIRECT just the same as it has always done,
and this doesn't affect any other filesystem's support for that feature
either,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2008-09-26 12:00 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2008-09-26 12:00 UTC (permalink / raw)
To: cluster-devel.redhat.com
I'm guessing that the merge window opening might not be too far
away now, and in any case, I won't have quite my normal internet
access next week. So I'm pushing out the current GFS2 tree so
that (I hope) there will be time to fix any issues.
Again, there are fewer patches here. A lot of them are fairly small
too. The most noteable item deals with the meta filesystem which
was in response to Al Viro's suggestions concerning a better way
to structure that code. It certainly results in a much cleaner
implementation, so thanks go to Al for pointing that out.
A couple of new features: I/O barrier support (needs no user
configuration, see the patch for details) and UUID support
(no code changes, its all userland but we reserve space in
the super block, again details in the patch itself).
I know that I hardly need say, but please let me know if you have any
comments :-)
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2008-12-17 11:29 swhiteho
0 siblings, 0 replies; 35+ messages in thread
From: swhiteho @ 2008-12-17 11:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
In preparation for the next merge window, here is the current content
of the GFS2 git tree. Firstly, we have one new feature, which is
support for the FIEMAP ioctl. That patch does touch some code
outside of GFS2 itself, but its the only patch in this series which
does so.
The remaining patches are mostly clean up and bug fixes, as usual.
They are working towards a point where I can submit a patch to finally
merge the lock_dlm module (not dlm itself I should emphasise) into
GFS2. That is a large patch and a preliminary version has already been
posted to cluster-devel. My plan is to put that patch into my -nmw
git tree as the first patch for the following merge window to give
it maximum exposure.
The other highlight of this patch series, is a patch which removes
the two daemons (gfs2_scand and gfs2_glockd) and replaces them with
a "shrinker" routine registered with the VM. As expected, this also
reduces the code size. We are also expecting do do a similar thing
with the GFS2 quota data strucutures at some point in the future.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2009-03-18 12:23 swhiteho
0 siblings, 0 replies; 35+ messages in thread
From: swhiteho @ 2009-03-18 12:23 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
So as the merge window draws closer, here is the current content of the
GFS2 git tree. The major item this time is patch 5 in the series. This
contians by far the majority of the changes, and the majority of those
changes are actually removal of code. The patch merges the lock_dlm
module (not the dlm itself, but GFS2's interface to the dlm) into
GFS2 itself. This means that a number of optimisations are then
possible in terms of merging strucutures resulting in a considerable
saving in memory.
Since that patch is so large (I'm afraid that it really doesn't make
any sense to split it up) its been in the -nmw git tree for the whole
period since the last merge window and has also been posted for review
on cluster-devel on a number of occasions before that. We've run a
number of tests on it as well in that period, so I believe that its
pretty stable now. It certainly makes the code a lot cleaner and
easier to follow in that area.
The remainder of the patches are mostly bug fixes, but there are one
or two other interesting features, those being:
o GFS2 now supports the discard I/O requests for thin provisioning, etc
o A new "demote a glock" interface is added to sysfs to help in
testing GFS2
o With a new mkfs.gfs2 which writes UUIDs, the UUID is now included
in uevent messages (with older filesystems which don't have UUIDs,
we just don't send that information)
The GFS2 tracing patches which I posted a little while back are not
included in this patch set. I think I can see what I need to do in
order to avoid patching blktrace now, so my plan is to look at those
patches again after this merge window, and when all the queued patches
for the tracing subsystem have been merged.
As always, please let us know if you spot any issues in the patches,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2009-06-10 8:30 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2009-06-10 8:30 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
As the merge window is more or less upon us, here is the content
of the GFS2 -nmw tree. There is nothing too startling this time
as the focus has very much been bug fixes and clean up.
We have a new mount option, commit= which does exactly the same
thing as the ext3 equivalent option. We have a long term plan to
make all the tunable parameters available as mount options and
thus to be able to eventually drop the sysfs interface to these
parameters.
Another long term plan is to get rid of the files named
ops_somethingorother and to either merge them into other
files, or rename them to not have the ops_ prefix. This
patch series makes a start on that, and does all the easy
ones. As a result some functions with only one caller are
moved to the same file as their caller and made static.
The docs are also updated to reflect the fact that the
lock_dlm interface module no longer exists and that interface is
now built into GFS2.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2009-09-10 11:27 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2009-09-10 11:27 UTC (permalink / raw)
To: cluster-devel.redhat.com
As merge time is approaching, here is the current content of the
GFS2 -nmw git tree. I'm not expecting to take any more patches
now for the current merge window unless any last minute bugs
are discovered.
There is not a huge amount new this time. Some extra context for
uevent messages, better error handling during block allocation,
and a clean up of extended attribute support. There is still more
to do on the extended attribute side of things, but this is a good
start I think.
There are a few bug fixes as well. Once these patches are merged
I'm intending to start off the next -nmw tree with a patch to
remove some of the (now unused) sysfs files as per the message
on cluster-devel a few weeks back.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-03-01 15:08 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2010-03-01 15:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Not so many patches for GFS2 this merge window. The bulk of the changes are
aimed at reducing overheads when caching large numbers of inodes and
the consequent simplification of the umount code,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-03-11 17:21 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2010-03-11 17:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Here are three small (but important!) fixes to GFS2.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-05-17 12:40 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Nothing very exciting this time.... mostly minor bug fixes and
a docs update. The gfs2_logd patch has been hanging around for
a long time and is now finally integrated. It is the first step
towards a longer term goal of improving performance in that
area,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-08-02 9:27 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2010-08-02 9:27 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here is the current content of the GFS2 -nmw git tree. Mostly its
just clean up and bug fixes this time. There is one exception which
is the "wait for journal id" patch which is a new feature aimed
at (eventually) allowing us to simplify the userland support which
GFS2 requires,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-10-18 14:15 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
I know the merge window isn't open yet, but at this stage I'm going to
hold off on any larger patches until the following merge window so this
patch set isn't likely to change much, hence kicking it out a bit early
for review.
There are a few interesting points to note in this patch set:
o GFS2 is updated to use the new truncate sequence
o Support for fallocate is added
o Clean up of some unused/obsolete mount options
I'm currently working on a patch to allow the glock hash table
to use RCU. That is currently a work-in-progress and that will
hopefully be ready for the succeeding merge window.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2013-01-03 11:50 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2013-01-03 11:50 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here are four small bug fixes for GFS2. There is no common theme here
really, just a few items that were fixed recently. The first fixes
lock name generation when the glock number is 0. The second fixes a
race allocating reservation structures and the final two fix a performance
issue by making small changes in the allocation code,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2013-04-05 9:57 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2013-04-05 9:57 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here are a few GFS2 fixes which are pending. There are two patches
which fix up a couple of minor issues in the DLM interface code,
a missing error path in gfs2_rs_alloc(), two patches which fix problems
during "withdraw" and a fix for discards/FITRIM when using 4k sector
sized devices,
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2015-02-10 10:36 Steven Whitehouse
0 siblings, 0 replies; 35+ messages in thread
From: Steven Whitehouse @ 2015-02-10 10:36 UTC (permalink / raw)
To: cluster-devel.redhat.com
This time we have mostly clean ups. There is a bug fix for a NULL dereference
relating to ACLs, and another which improves (but does not fix entirely) an
allocation fall-back code path. The other three patches are small clean ups.
Steve.
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2015-02-10 10:36 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-11 10:11 [Cluster-devel] [GFS2] Pre-pull patch posting swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 01/18] [GFS2] Clean up the glock core swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 02/18] [GFS2] Fix ordering bug in lock_dlm swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 03/18] [GFS2] No lock_nolock swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 04/18] [GFS2] trivial sparse lock annotations swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 05/18] [GFS2] Fix ordering of args for list_add swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 06/18] [GFS2] Revise readpage locking swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 07/18] [GFS2] kernel panic mounting volume swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 08/18] [GFS2] Remove remote lock dropping code swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 09/18] [GFS2] Remove obsolete conversion deadlock avoidance code swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 10/18] [GFS2] Remove all_list from lock_dlm swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 11/18] [GFS2] Glock documentation swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 12/18] [GFS2] Fix module building swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 13/18] [GFS2] don't call permission() swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 14/18] [GFS2] Fix delayed demote race swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 15/18] [GFS2] Allow local DF locks when holding a cached EX glock swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 16/18] [GFS2] Replace rgrp "recent list" with mru list swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 17/18] [GFS2] Remove support for unused and pointless flag swhiteho
2008-07-11 10:11 ` [Cluster-devel] [PATCH 18/18] [GFS2] Remove unused declaration swhiteho
[not found] ` <Pine.LNX.4.61.0807110714110.18083@chaos.analogic.com>
2008-07-11 11:52 ` [Cluster-devel] Re: [PATCH 17/18] [GFS2] Remove support for unused and pointless flag Steven Whitehouse
-- strict thread matches above, loose matches on Subject: below --
2015-02-10 10:36 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2013-04-05 9:57 Steven Whitehouse
2013-01-03 11:50 Steven Whitehouse
2010-10-18 14:15 Steven Whitehouse
2010-08-02 9:27 Steven Whitehouse
2010-05-17 12:40 Steven Whitehouse
2010-03-11 17:21 Steven Whitehouse
2010-03-01 15:08 Steven Whitehouse
2009-09-10 11:27 Steven Whitehouse
2009-06-10 8:30 Steven Whitehouse
2009-03-18 12:23 [Cluster-devel] [GFS2] " swhiteho
2008-12-17 11:29 [Cluster-devel] GFS2: " swhiteho
2008-09-26 12:00 Steven Whitehouse
2008-04-17 8:37 [Cluster-devel] [GFS2] " swhiteho
2008-01-21 9:21 swhiteho
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).