* [PATCH] Remove l_flushsema
@ 2008-04-30 9:05 Matthew Wilcox
2008-04-30 10:24 ` Matthew Wilcox
2008-04-30 10:41 ` David Chinner
0 siblings, 2 replies; 12+ messages in thread
From: Matthew Wilcox @ 2008-04-30 9:05 UTC (permalink / raw)
To: David Chinner; +Cc: xfs, linux-fsdevel
The l_flushsema doesn't exactly have completion semantics, nor mutex
semantics. It's used as a list of tasks which are waiting to be notified
that a flush has completed. It was also being used in a way that was
potentially racy, depending on the semaphore implementation.
By using a waitqueue instead of a semaphore we avoid the need for a
separate counter, since we know we just need to wake everything on the
queue.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
--
I've only given this light testing; it could use some more.
fs/xfs/xfs_log.c | 19 +++++++------------
fs/xfs/xfs_log_priv.h | 6 ++----
2 files changed, 9 insertions(+), 16 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index afaee30..d2e3092 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1228,7 +1228,7 @@ xlog_alloc_log(xfs_mount_t *mp,
spin_lock_init(&log->l_icloglock);
spin_lock_init(&log->l_grant_lock);
- initnsema(&log->l_flushsema, 0, "ic-flush");
+ init_waitqueue_head(&log->l_flush_wq);
/* log record size must be multiple of BBSIZE; see xlog_rec_header_t */
ASSERT((XFS_BUF_SIZE(bp) & BBMASK) == 0);
@@ -1573,7 +1573,6 @@ xlog_dealloc_log(xlog_t *log)
kmem_free(iclog, sizeof(xlog_in_core_t));
iclog = next_iclog;
}
- freesema(&log->l_flushsema);
spinlock_destroy(&log->l_icloglock);
spinlock_destroy(&log->l_grant_lock);
@@ -2278,14 +2277,9 @@ xlog_state_do_callback(
}
#endif
- flushcnt = 0;
- if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR)) {
- flushcnt = log->l_flushcnt;
- log->l_flushcnt = 0;
- }
+ if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR))
+ wake_up_all(&log->l_flush_wq);
spin_unlock(&log->l_icloglock);
- while (flushcnt--)
- vsema(&log->l_flushsema);
} /* xlog_state_do_callback */
@@ -2385,12 +2379,13 @@ restart:
iclog = log->l_iclog;
if (! (iclog->ic_state == XLOG_STATE_ACTIVE)) {
- log->l_flushcnt++;
+ DEFINE_WAIT(wait);
+ prepare_to_wait(&log->l_flush_wq, &wait, TASK_UNINTERRUPTIBLE);
spin_unlock(&log->l_icloglock);
xlog_trace_iclog(iclog, XLOG_TRACE_SLEEP_FLUSH);
XFS_STATS_INC(xs_log_noiclogs);
- /* Ensure that log writes happen */
- psema(&log->l_flushsema, PINOD);
+ /* Wait for log writes to have flushed */
+ schedule();
goto restart;
}
ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE);
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 8952a39..a6dff16 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -423,10 +423,8 @@ typedef struct log {
int l_logBBsize; /* size of log in BB chunks */
/* The following block of fields are changed while holding icloglock */
- sema_t l_flushsema ____cacheline_aligned_in_smp;
- /* iclog flushing semaphore */
- int l_flushcnt; /* # of procs waiting on this
- * sema */
+ wait_queue_head_t l_flush_wq ____cacheline_aligned_in_smp;
+ /* waiting for iclog flush */
int l_covered_state;/* state of "covering disk
* log entries" */
xlog_in_core_t *l_iclog; /* head log queue */
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [PATCH] Remove l_flushsema
2008-04-30 9:05 [PATCH] Remove l_flushsema Matthew Wilcox
@ 2008-04-30 10:24 ` Matthew Wilcox
2008-04-30 10:41 ` David Chinner
1 sibling, 0 replies; 12+ messages in thread
From: Matthew Wilcox @ 2008-04-30 10:24 UTC (permalink / raw)
To: David Chinner; +Cc: xfs, linux-fsdevel
On Wed, Apr 30, 2008 at 03:05:03AM -0600, Matthew Wilcox wrote:
> @@ -2385,12 +2379,13 @@ restart:
>
> iclog = log->l_iclog;
> if (! (iclog->ic_state == XLOG_STATE_ACTIVE)) {
> - log->l_flushcnt++;
> + DEFINE_WAIT(wait);
> + prepare_to_wait(&log->l_flush_wq, &wait, TASK_UNINTERRUPTIBLE);
> spin_unlock(&log->l_icloglock);
> xlog_trace_iclog(iclog, XLOG_TRACE_SLEEP_FLUSH);
> XFS_STATS_INC(xs_log_noiclogs);
> - /* Ensure that log writes happen */
> - psema(&log->l_flushsema, PINOD);
> + /* Wait for log writes to have flushed */
> + schedule();
> goto restart;
> }
> ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE);
Christoph points out that this is missing a call to finish_wait() after
schedule() and he's absolutely right. It only matters if the task was
woken up by something other than being on this wait queue, so my testing
didn't notice it was missing.
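For reference, with finish_wait() added after schedule(), the retry loop in the hunk above would look like this (a sketch based on the patch as posted, not a tested revision):

```c
	if (iclog->ic_state != XLOG_STATE_ACTIVE) {
		DEFINE_WAIT(wait);

		prepare_to_wait(&log->l_flush_wq, &wait, TASK_UNINTERRUPTIBLE);
		spin_unlock(&log->l_icloglock);
		xlog_trace_iclog(iclog, XLOG_TRACE_SLEEP_FLUSH);
		XFS_STATS_INC(xs_log_noiclogs);
		/* Wait for log writes to have flushed */
		schedule();
		/* Take ourselves back off the queue in case something other
		 * than wake_up_all() woke us; harmless if it already did. */
		finish_wait(&log->l_flush_wq, &wait);
		goto restart;
	}
```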
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [PATCH] Remove l_flushsema
2008-04-30 9:05 [PATCH] Remove l_flushsema Matthew Wilcox
2008-04-30 10:24 ` Matthew Wilcox
@ 2008-04-30 10:41 ` David Chinner
2008-04-30 10:58 ` Christoph Hellwig
1 sibling, 1 reply; 12+ messages in thread
From: David Chinner @ 2008-04-30 10:41 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: David Chinner, xfs, linux-fsdevel
On Wed, Apr 30, 2008 at 03:05:03AM -0600, Matthew Wilcox wrote:
>
> The l_flushsema doesn't exactly have completion semantics, nor mutex
> semantics. It's used as a list of tasks which are waiting to be notified
> that a flush has completed. It was also being used in a way that was
> potentially racy, depending on the semaphore implementation.
>
> By using a waitqueue instead of a semaphore we avoid the need for a
> separate counter, since we know we just need to wake everything on the
> queue.
Looks good at first glance. Thanks for doing this, Matthew.
I've been swamped the last couple of days so I haven't had
a chance to do this myself....
> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
>
> --
>
> I've only given this light testing, it could use some more.
Yeah, I've pulled it into my qa tree so it'll get some shaking down.
If it survives for a while, I'll push it into the xfs tree.
One comment, though:
> @@ -2278,14 +2277,9 @@ xlog_state_do_callback(
> }
> #endif
>
> - flushcnt = 0;
> - if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR)) {
> - flushcnt = log->l_flushcnt;
> - log->l_flushcnt = 0;
> - }
> + if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR))
> + wake_up_all(&log->l_flush_wq);
> spin_unlock(&log->l_icloglock);
> - while (flushcnt--)
> - vsema(&log->l_flushsema);
The only thing that I'm concerned about here is that this will
substantially increase the time the l_icloglock is held. This is
a severely contended lock on large cpu count machines and putting
the wakeup inside this lock will increase the hold time.
I guess I can address this by adding a new lock for the waitqueue
in a separate patch set.
Hmmm - CONFIG_XFS_DEBUG builds break in the xfs-dev tree with
this patch (in the xfs kdb module). I'll fix this up as well.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: [PATCH] Remove l_flushsema
2008-04-30 10:41 ` David Chinner
@ 2008-04-30 10:58 ` Christoph Hellwig
2008-04-30 11:11 ` David Chinner
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2008-04-30 10:58 UTC (permalink / raw)
To: David Chinner; +Cc: Matthew Wilcox, xfs, linux-fsdevel
On Wed, Apr 30, 2008 at 08:41:25PM +1000, David Chinner wrote:
> The only thing that I'm concerned about here is that this will
> substantially increase the time the l_icloglock is held. This is
> a severely contended lock on large cpu count machines and putting
> the wakeup inside this lock will increase the hold time.
>
> I guess I can address this by adding a new lock for the waitqueue
> in a separate patch set.
waitqueues are locked internally and don't need synchronization. With
a little bit of re-arranging the code the wake_up could probably be
moved out of the critical section.
* Re: [PATCH] Remove l_flushsema
2008-04-30 10:58 ` Christoph Hellwig
@ 2008-04-30 11:11 ` David Chinner
2008-04-30 11:15 ` Christoph Hellwig
2008-04-30 11:52 ` Matthew Wilcox
0 siblings, 2 replies; 12+ messages in thread
From: David Chinner @ 2008-04-30 11:11 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Chinner, Matthew Wilcox, xfs, linux-fsdevel
On Wed, Apr 30, 2008 at 06:58:32AM -0400, Christoph Hellwig wrote:
> On Wed, Apr 30, 2008 at 08:41:25PM +1000, David Chinner wrote:
> > The only thing that I'm concerned about here is that this will
> > substantially increase the time the l_icloglock is held. This is
> > a severely contended lock on large cpu count machines and putting
> > the wakeup inside this lock will increase the hold time.
> >
> > I guess I can address this by adding a new lock for the waitqueue
> > in a separate patch set.
>
> waitqueues are locked internally and don't need synchronization. With
> a little bit of re-arranging the code the wake_up could probably be
> moved out of the critical section.
Yeah, I just realised that myself and was about to reply as such....
I'll move the wakeup outside the lock.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: [PATCH] Remove l_flushsema
2008-04-30 11:11 ` David Chinner
@ 2008-04-30 11:15 ` Christoph Hellwig
2008-04-30 11:34 ` David Chinner
2008-04-30 11:52 ` Matthew Wilcox
1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2008-04-30 11:15 UTC (permalink / raw)
To: David Chinner; +Cc: Christoph Hellwig, Matthew Wilcox, xfs
On Wed, Apr 30, 2008 at 09:11:54PM +1000, David Chinner wrote:
> > waitqueues are locked internally and don't need synchronization. With
> > a little bit of re-arranging the code the wake_up could probably be
> > moved out of the critical section.
>
> Yeah, I just realised that myself and was about to reply as such....
>
> I'll move the wakeup outside the lock.
Below is the version I have now. One of the rare cases where using
sv_t actually cleans up the code (although the whole sv_ family should
probably lose some arguments).
Index: linux-2.6-xfs/fs/xfs/xfs_log.c
===================================================================
--- linux-2.6-xfs.orig/fs/xfs/xfs_log.c 2008-04-25 20:11:58.000000000 +0200
+++ linux-2.6-xfs/fs/xfs/xfs_log.c 2008-04-30 13:13:48.000000000 +0200
@@ -1228,7 +1228,7 @@ xlog_alloc_log(xfs_mount_t *mp,
spin_lock_init(&log->l_icloglock);
spin_lock_init(&log->l_grant_lock);
- initnsema(&log->l_flushsema, 0, "ic-flush");
+ sv_init(&log->l_flush_wait, 0, "flush_wait");
/* log record size must be multiple of BBSIZE; see xlog_rec_header_t */
ASSERT((XFS_BUF_SIZE(bp) & BBMASK) == 0);
@@ -1573,7 +1573,6 @@ xlog_dealloc_log(xlog_t *log)
kmem_free(iclog, sizeof(xlog_in_core_t));
iclog = next_iclog;
}
- freesema(&log->l_flushsema);
spinlock_destroy(&log->l_icloglock);
spinlock_destroy(&log->l_grant_lock);
@@ -2097,6 +2096,7 @@ xlog_state_do_callback(
int funcdidcallbacks; /* flag: function did callbacks */
int repeats; /* for issuing console warnings if
* looping too many times */
+ int wake = 0;
spin_lock(&log->l_icloglock);
first_iclog = iclog = log->l_iclog;
@@ -2278,15 +2278,13 @@ xlog_state_do_callback(
}
#endif
- flushcnt = 0;
- if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR)) {
- flushcnt = log->l_flushcnt;
- log->l_flushcnt = 0;
- }
+ if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR))
+ wake = 1;
spin_unlock(&log->l_icloglock);
- while (flushcnt--)
- vsema(&log->l_flushsema);
-} /* xlog_state_do_callback */
+
+ if (wake)
+ sv_broadcast(&log->l_flush_wait);
+}
/*
@@ -2384,16 +2382,15 @@ restart:
}
iclog = log->l_iclog;
- if (! (iclog->ic_state == XLOG_STATE_ACTIVE)) {
- log->l_flushcnt++;
- spin_unlock(&log->l_icloglock);
+ if (iclog->ic_state != XLOG_STATE_ACTIVE) {
xlog_trace_iclog(iclog, XLOG_TRACE_SLEEP_FLUSH);
XFS_STATS_INC(xs_log_noiclogs);
- /* Ensure that log writes happen */
- psema(&log->l_flushsema, PINOD);
+
+ /* Wait for log writes to have flushed */
+ sv_wait(&log->l_flush_wait, 0, &log->l_icloglock, 0);
goto restart;
}
- ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE);
+
head = &iclog->ic_header;
atomic_inc(&iclog->ic_refcnt); /* prevents sync */
Index: linux-2.6-xfs/fs/xfs/xfs_log_priv.h
===================================================================
--- linux-2.6-xfs.orig/fs/xfs/xfs_log_priv.h 2008-04-25 20:11:58.000000000 +0200
+++ linux-2.6-xfs/fs/xfs/xfs_log_priv.h 2008-04-30 13:09:33.000000000 +0200
@@ -423,10 +423,8 @@ typedef struct log {
int l_logBBsize; /* size of log in BB chunks */
/* The following block of fields are changed while holding icloglock */
- sema_t l_flushsema ____cacheline_aligned_in_smp;
- /* iclog flushing semaphore */
- int l_flushcnt; /* # of procs waiting on this
- * sema */
+ sv_t l_flush_wait ____cacheline_aligned_in_smp;
+ /* waiting for iclog flush */
int l_covered_state;/* state of "covering disk
* log entries" */
xlog_in_core_t *l_iclog; /* head log queue */
Index: linux-2.6-xfs/fs/xfs/xfsidbg.c
===================================================================
--- linux-2.6-xfs.orig/fs/xfs/xfsidbg.c 2008-04-30 13:09:58.000000000 +0200
+++ linux-2.6-xfs/fs/xfs/xfsidbg.c 2008-04-30 13:10:26.000000000 +0200
@@ -5829,8 +5829,8 @@ xfsidbg_xlog(xlog_t *log)
};
kdb_printf("xlog at 0x%p\n", log);
- kdb_printf("&flushsm: 0x%p flushcnt: %d ICLOG: 0x%p \n",
- &log->l_flushsema, log->l_flushcnt, log->l_iclog);
+ kdb_printf("&flush_wait: 0x%p ICLOG: 0x%p \n",
+ &log->l_flush_wait, log->l_iclog);
kdb_printf("&icloglock: 0x%p tail_lsn: %s last_sync_lsn: %s \n",
&log->l_icloglock, xfs_fmtlsn(&log->l_tail_lsn),
xfs_fmtlsn(&log->l_last_sync_lsn));
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] Remove l_flushsema
2008-04-30 11:15 ` Christoph Hellwig
@ 2008-04-30 11:34 ` David Chinner
2008-04-30 11:37 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: David Chinner @ 2008-04-30 11:34 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Chinner, Matthew Wilcox, xfs
On Wed, Apr 30, 2008 at 07:15:21AM -0400, Christoph Hellwig wrote:
> On Wed, Apr 30, 2008 at 09:11:54PM +1000, David Chinner wrote:
> > > waitqueues are locked internally and don't need synchronization. With
> > > a little bit of re-arranging the code the wake_up could probably be
> > > moved out of the critical section.
> >
> > Yeah, I just realised that myself and was about to reply as such....
> >
> > I'll move the wakeup outside the lock.
>
> Below is the version I have now. One of the rare cases where using
> sv_t actually cleans up the code (although the whole sv_ family should
> probably lose some arguments).
Yep, much cleaner. Whose sign-off goes on this?
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: [PATCH] Remove l_flushsema
2008-04-30 11:34 ` David Chinner
@ 2008-04-30 11:37 ` Christoph Hellwig
2008-04-30 15:17 ` Matthew Wilcox
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2008-04-30 11:37 UTC (permalink / raw)
To: David Chinner; +Cc: Christoph Hellwig, Matthew Wilcox, xfs
On Wed, Apr 30, 2008 at 09:34:18PM +1000, David Chinner wrote:
> > probably lose some arguments).
>
> Yep, much cleaner. Whose sign-off goes on this?
You can have mine:
Signed-off-by: Christoph Hellwig <hch@lst.de>
but I think it's still essentially willy's and he should be credited for
it.
* Re: [PATCH] Remove l_flushsema
2008-04-30 11:11 ` David Chinner
2008-04-30 11:15 ` Christoph Hellwig
@ 2008-04-30 11:52 ` Matthew Wilcox
2008-04-30 12:14 ` David Chinner
1 sibling, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2008-04-30 11:52 UTC (permalink / raw)
To: David Chinner; +Cc: Christoph Hellwig, xfs, linux-fsdevel
On Wed, Apr 30, 2008 at 09:11:54PM +1000, David Chinner wrote:
> On Wed, Apr 30, 2008 at 06:58:32AM -0400, Christoph Hellwig wrote:
> > On Wed, Apr 30, 2008 at 08:41:25PM +1000, David Chinner wrote:
> > > The only thing that I'm concerned about here is that this will
> > > substantially increase the time the l_icloglock is held. This is
> > > a severely contended lock on large cpu count machines and putting
> > > the wakeup inside this lock will increase the hold time.
> > >
> > > I guess I can address this by adding a new lock for the waitqueue
> > > in a separate patch set.
> >
> > waitqueues are locked internally and don't need synchronization. With
> > a little bit of re-arranging the code the wake_up could probably be
> > moved out of the critical section.
>
> Yeah, I just realised that myself and was about to reply as such....
>
> I'll move the wakeup outside the lock.
I can't tell whether this race matters ... probably not:
N processes come in and queue up waiting for the flush
xlog_state_do_callback() is called
it unlocks the spinlock
a new task comes in and takes the spinlock
wakeups happen
ie do we care about 'fairness' here, or is it OK for a new task to jump
the queue?
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [PATCH] Remove l_flushsema
2008-04-30 11:52 ` Matthew Wilcox
@ 2008-04-30 12:14 ` David Chinner
0 siblings, 0 replies; 12+ messages in thread
From: David Chinner @ 2008-04-30 12:14 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: David Chinner, Christoph Hellwig, xfs, linux-fsdevel
On Wed, Apr 30, 2008 at 05:52:53AM -0600, Matthew Wilcox wrote:
> On Wed, Apr 30, 2008 at 09:11:54PM +1000, David Chinner wrote:
> > On Wed, Apr 30, 2008 at 06:58:32AM -0400, Christoph Hellwig wrote:
> > > On Wed, Apr 30, 2008 at 08:41:25PM +1000, David Chinner wrote:
> > > > The only thing that I'm concerned about here is that this will
> > > > substantially increase the time the l_icloglock is held. This is
> > > > a severely contended lock on large cpu count machines and putting
> > > > the wakeup inside this lock will increase the hold time.
> > > >
> > > > I guess I can address this by adding a new lock for the waitqueue
> > > > in a separate patch set.
> > >
> > > waitqueues are locked internally and don't need synchronization. With
> > > a little bit of re-arranging the code the wake_up could probably be
> > > moved out of the critical section.
> >
> > Yeah, I just realised that myself and was about to reply as such....
> >
> > I'll move the wakeup outside the lock.
>
> I can't tell whether this race matters ... probably not:
>
> N processes come in and queue up waiting for the flush
> xlog_state_do_callback() is called
> it unlocks the spinlock
> a new task comes in and takes the spinlock
> wakeups happen
>
> ie do we care about 'fairness' here, or is it OK for a new task to jump
> the queue?
This has always been a possibility here. However, this deep inside the log code
I don't think it really matters because the waiters have log space
reservations. In overload conditions, fairness is handled when obtaining
a reservation via an ordered ticket queue (see xlog_grant_log_space()).
Thundering herds tend to be thinned to smaller bursts by this queue, too...
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
* Re: [PATCH] Remove l_flushsema
2008-04-30 11:37 ` Christoph Hellwig
@ 2008-04-30 15:17 ` Matthew Wilcox
2008-05-01 1:19 ` David Chinner
0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2008-04-30 15:17 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Chinner, xfs
On Wed, Apr 30, 2008 at 07:37:53AM -0400, Christoph Hellwig wrote:
> On Wed, Apr 30, 2008 at 09:34:18PM +1000, David Chinner wrote:
> > > probably lose some arguments).
> >
> > > Yep, much cleaner. Whose sign-off goes on this?
>
> You can have mine:
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> but I think it's still essentially willy's and he should be credited for
> it.
I'm fine with adding my S-o-B to this version:
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Here's a little twist on the idea to avoid the thundering herd.
A vigorous review of this might not be a bad idea -- the idea is to only
wake up sleeping processes when there seems to be enough space in the
log to make it worthwhile. So there are a few places where we unlock the
l_icloglock and jump back to restart; I didn't add an sv_signal there.
But there should be an sv_signal before each exit from the function,
and I think I've done that.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 1bfe3f9..4533a10 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -2282,7 +2282,7 @@ xlog_state_do_callback(
if (log->l_iclog->ic_state & (XLOG_STATE_ACTIVE|XLOG_STATE_IOERROR))
wait = 1;
spin_unlock(&log->l_icloglock);
- sv_broadcast(&log->l_flush_wait);
+ sv_signal(&log->l_flush_wait);
} /* xlog_state_do_callback */
@@ -2377,6 +2377,7 @@ restart:
spin_lock(&log->l_icloglock);
if (XLOG_FORCED_SHUTDOWN(log)) {
spin_unlock(&log->l_icloglock);
+ sv_signal(&log->l_flush_wait);
return XFS_ERROR(EIO);
}
@@ -2425,8 +2426,11 @@ restart:
/* If I'm the only one writing to this iclog, sync it to disk */
if (atomic_read(&iclog->ic_refcnt) == 1) {
spin_unlock(&log->l_icloglock);
- if ((error = xlog_state_release_iclog(log, iclog)))
+ error = xlog_state_release_iclog(log, iclog);
+ if (error) {
+ sv_signal(&log->l_flush_wait);
return error;
+ }
} else {
atomic_dec(&iclog->ic_refcnt);
spin_unlock(&log->l_icloglock);
@@ -2452,6 +2456,7 @@ restart:
ASSERT(iclog->ic_offset <= iclog->ic_size);
spin_unlock(&log->l_icloglock);
+ sv_signal(&log->l_flush_wait);
*logoffsetp = log_offset;
return 0;
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: [PATCH] Remove l_flushsema
2008-04-30 15:17 ` Matthew Wilcox
@ 2008-05-01 1:19 ` David Chinner
0 siblings, 0 replies; 12+ messages in thread
From: David Chinner @ 2008-05-01 1:19 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Christoph Hellwig, David Chinner, xfs
On Wed, Apr 30, 2008 at 09:17:14AM -0600, Matthew Wilcox wrote:
> On Wed, Apr 30, 2008 at 07:37:53AM -0400, Christoph Hellwig wrote:
> > On Wed, Apr 30, 2008 at 09:34:18PM +1000, David Chinner wrote:
> > > > probably lose some arguments).
> > >
> > > Yep, much cleaner. Whose sign-off goes on this?
> >
> > You can have mine:
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> >
> > but I think it's still essentially willy's and he should be credited for
> > it.
>
> I'm fine with adding my S-o-B to this version:
>
> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
>
> Here's a little twist on the idea to avoid the thundering herd.
> A vigorous review of this might not be a bad idea -- the idea is to only
> wake up sleeping processes when there seems to be enough space in the
> log to make it worthwhile. So there's a few places where we unlock the
> l_icloglock and jump back to restart; I didn't add an sv_signal there.
> But there should be an sv_signal before each exit from the function,
> and I think I've done that.
That might work. I'll have to look at it more detail later and do
some performance testing when I'm not so busy with other stuff.
FWIW, in all the error or shutdown cases, it may as well be a broadcast
as every subsequent process through this code will get the same error.
i.e. once a log error occurs, the filesystem gets shut down....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Thread overview: 12+ messages (newest: 2008-05-01 1:19 UTC)
2008-04-30 9:05 [PATCH] Remove l_flushsema Matthew Wilcox
2008-04-30 10:24 ` Matthew Wilcox
2008-04-30 10:41 ` David Chinner
2008-04-30 10:58 ` Christoph Hellwig
2008-04-30 11:11 ` David Chinner
2008-04-30 11:15 ` Christoph Hellwig
2008-04-30 11:34 ` David Chinner
2008-04-30 11:37 ` Christoph Hellwig
2008-04-30 15:17 ` Matthew Wilcox
2008-05-01 1:19 ` David Chinner
2008-04-30 11:52 ` Matthew Wilcox
2008-04-30 12:14 ` David Chinner