* xfs: grant lock scaling and removal V3
@ 2010-12-13 4:44 Dave Chinner
2010-12-13 4:44 ` [PATCH 01/12] xfs: convert log grant ticket queues to list heads Dave Chinner
` (11 more replies)
0 siblings, 12 replies; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
This series addresses the log grant lock contention seen by 8-way
fs_mark workloads. The log grant lock protects:
- reserve grant head and wait queue
- write grant head and wait queue
- tail lsn
- last sync lsn
While they are all currently protected by a single lock, there is no
reason that they need to all be under the same lock. As a result,
one option for scaling was simply to split the grant lock into three
locks - one for each of the above groups. However, this would mean
that we'd need to nest locks inside each other and it ignores the
fact that we really only use the lock on the tail and last sync lsn
variables to protect against concurrent updates.
Hence we can make the tail and last sync LSN variables independent
of the grant lock by making them atomic variables. This means that
when we are moving the tail of the log, we can avoid all locking
except when there are waiters queued on the grant heads.
Making the grant heads scale better is a bit more of a challenge.
Just replacing the grant lock with reserve and write grant locks
doesn't really help improve scalability because we'd still need to
take both locks in the hot xfs_log_reserve() path. To improve
scalability, we really need to make this path lock free.
The first steps aree to clean up some of the code. We convert the
ticket queues to use the common list_head infrastructure, factor out
some common debug code, refactor and rearrange the grant head
calculations code and convert all the users of the sv_t wait
mechanisms to use wait queues directly.
The second step to acheiving this is to encode the grant heads as a
64 bit variable and then convert it to an atomic variable. The
tail/last sync LSNs also get converted to atomic variables, and this
means we can read the grant heads without holding locks and that
allows tail pushing calculations and available log space
calculations to operate lock free.
The next step is to introduce a lock per grant queue that is used
exclusively to protect queue manpulations. With the use of
list_empty_careful() we can check whether the queue has waiters
without holding the queue lock. Hence in the case where the queues
are empty we do not need to take the queue locks in the fast path.
Finally, we need to make the grant head space calculations lockless.
With the grant heads already being atomic variables, we can change
the calculation algorithm to a lockless cmpxchg algorithm. This
means we no longer need any spinlocks in the transaction reserve
fast path and hence the scalability of this path should be
significantly improved.
There is one down side to this change - the xlog_verify_head() debug
code can no longer be reliably used to detect accounting problems in
the grant space allocations as it requires an atomic sample of both
grant heads. However, the tail verification and the
xlog_space_left() verification still works without problems, so we
still have some debug checking on the grant head locations.
Version 3:
- dropped cleanup of xlog_grant_log_space() and
xlog_regrant_log_write_space().
- split grant head aggregation into multiple patches
- split out xlog_verify_tail() function
- factor grant head calculations and drop wrappers
- combine grant heads and add wrappers to crack/combine
grant heads.
- removed intermediate grant head "_lsn" suffix name.
- folded all sv_t removal patches into one.
- don't pass tail and last sync lsn into xlog_grant_push_ail().
- ensure that shutdown checks in ticket queue processing are done
consistently before sleeping.
- removed xlog_grant_verify_head()
- folded grant lock removal into patch that converts grant head
manipulations to lockless algorithms.
- added a couple of tracepoints for when the log tail is moved and
queued tickets are woken to aid debugging.
Version 2:
- split into lots of patches
- clean up the code and comments
- add patches to clean up sv_t usage at the end of the series
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 01/12] xfs: convert log grant ticket queues to list heads
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:34 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 02/12] xfs: fact out common grant head/log tail verification code Dave Chinner
` (10 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
The grant write and reserve queues use a roll-your-own double linked
list, so convert it to a standard list_head structure and convert
all the list traversals to use list_for_each_entry(). We can also
get rid of the XLOG_TIC_IN_Q flag as we can use the list_empty()
check to tell if the ticket is in a list or not.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/linux-2.6/xfs_trace.h | 12 ++--
fs/xfs/xfs_log.c | 123 ++++++++++++++----------------------------
fs/xfs/xfs_log_priv.h | 11 ++---
3 files changed, 51 insertions(+), 95 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index acef2e9..f400668 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -766,8 +766,8 @@ DECLARE_EVENT_CLASS(xfs_loggrant_class,
__field(int, curr_res)
__field(int, unit_res)
__field(unsigned int, flags)
- __field(void *, reserve_headq)
- __field(void *, write_headq)
+ __field(void *, reserveq)
+ __field(void *, writeq)
__field(int, grant_reserve_cycle)
__field(int, grant_reserve_bytes)
__field(int, grant_write_cycle)
@@ -784,8 +784,8 @@ DECLARE_EVENT_CLASS(xfs_loggrant_class,
__entry->curr_res = tic->t_curr_res;
__entry->unit_res = tic->t_unit_res;
__entry->flags = tic->t_flags;
- __entry->reserve_headq = log->l_reserve_headq;
- __entry->write_headq = log->l_write_headq;
+ __entry->reserveq = log->l_reserveq.next;
+ __entry->writeq = log->l_writeq.next;
__entry->grant_reserve_cycle = log->l_grant_reserve_cycle;
__entry->grant_reserve_bytes = log->l_grant_reserve_bytes;
__entry->grant_write_cycle = log->l_grant_write_cycle;
@@ -807,8 +807,8 @@ DECLARE_EVENT_CLASS(xfs_loggrant_class,
__entry->curr_res,
__entry->unit_res,
__print_flags(__entry->flags, "|", XLOG_TIC_FLAGS),
- __entry->reserve_headq,
- __entry->write_headq,
+ __entry->reserveq,
+ __entry->writeq,
__entry->grant_reserve_cycle,
__entry->grant_reserve_bytes,
__entry->grant_write_cycle,
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index cee4ab9..1b82735 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -95,38 +95,6 @@ STATIC void xlog_verify_tail_lsn(xlog_t *log, xlog_in_core_t *iclog,
STATIC int xlog_iclogs_empty(xlog_t *log);
-
-static void
-xlog_ins_ticketq(struct xlog_ticket **qp, struct xlog_ticket *tic)
-{
- if (*qp) {
- tic->t_next = (*qp);
- tic->t_prev = (*qp)->t_prev;
- (*qp)->t_prev->t_next = tic;
- (*qp)->t_prev = tic;
- } else {
- tic->t_prev = tic->t_next = tic;
- *qp = tic;
- }
-
- tic->t_flags |= XLOG_TIC_IN_Q;
-}
-
-static void
-xlog_del_ticketq(struct xlog_ticket **qp, struct xlog_ticket *tic)
-{
- if (tic == tic->t_next) {
- *qp = NULL;
- } else {
- *qp = tic->t_next;
- tic->t_next->t_prev = tic->t_prev;
- tic->t_prev->t_next = tic->t_next;
- }
-
- tic->t_next = tic->t_prev = NULL;
- tic->t_flags &= ~XLOG_TIC_IN_Q;
-}
-
static void
xlog_grant_sub_space(struct log *log, int bytes)
{
@@ -724,7 +692,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
log->l_tail_lsn = tail_lsn;
}
- if ((tic = log->l_write_headq)) {
+ if (!list_empty(&log->l_writeq)) {
#ifdef DEBUG
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
panic("Recovery problem");
@@ -732,7 +700,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
cycle = log->l_grant_write_cycle;
bytes = log->l_grant_write_bytes;
free_bytes = xlog_space_left(log, cycle, bytes);
- do {
+ list_for_each_entry(tic, &log->l_writeq, t_queue) {
ASSERT(tic->t_flags & XLOG_TIC_PERM_RESERV);
if (free_bytes < tic->t_unit_res && tail_lsn != 1)
@@ -740,10 +708,10 @@ xfs_log_move_tail(xfs_mount_t *mp,
tail_lsn = 0;
free_bytes -= tic->t_unit_res;
sv_signal(&tic->t_wait);
- tic = tic->t_next;
- } while (tic != log->l_write_headq);
+ }
}
- if ((tic = log->l_reserve_headq)) {
+
+ if (!list_empty(&log->l_reserveq)) {
#ifdef DEBUG
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
panic("Recovery problem");
@@ -751,7 +719,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
cycle = log->l_grant_reserve_cycle;
bytes = log->l_grant_reserve_bytes;
free_bytes = xlog_space_left(log, cycle, bytes);
- do {
+ list_for_each_entry(tic, &log->l_reserveq, t_queue) {
if (tic->t_flags & XLOG_TIC_PERM_RESERV)
need_bytes = tic->t_unit_res*tic->t_cnt;
else
@@ -761,8 +729,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
tail_lsn = 0;
free_bytes -= need_bytes;
sv_signal(&tic->t_wait);
- tic = tic->t_next;
- } while (tic != log->l_reserve_headq);
+ }
}
spin_unlock(&log->l_grant_lock);
} /* xfs_log_move_tail */
@@ -1053,6 +1020,8 @@ xlog_alloc_log(xfs_mount_t *mp,
log->l_curr_cycle = 1; /* 0 is bad since this is initial value */
log->l_grant_reserve_cycle = 1;
log->l_grant_write_cycle = 1;
+ INIT_LIST_HEAD(&log->l_reserveq);
+ INIT_LIST_HEAD(&log->l_writeq);
error = EFSCORRUPTED;
if (xfs_sb_version_hassector(&mp->m_sb)) {
@@ -2550,8 +2519,8 @@ xlog_grant_log_space(xlog_t *log,
trace_xfs_log_grant_enter(log, tic);
/* something is already sleeping; insert new transaction at end */
- if (log->l_reserve_headq) {
- xlog_ins_ticketq(&log->l_reserve_headq, tic);
+ if (!list_empty(&log->l_reserveq)) {
+ list_add_tail(&tic->t_queue, &log->l_reserveq);
trace_xfs_log_grant_sleep1(log, tic);
@@ -2583,8 +2552,8 @@ redo:
free_bytes = xlog_space_left(log, log->l_grant_reserve_cycle,
log->l_grant_reserve_bytes);
if (free_bytes < need_bytes) {
- if ((tic->t_flags & XLOG_TIC_IN_Q) == 0)
- xlog_ins_ticketq(&log->l_reserve_headq, tic);
+ if (list_empty(&tic->t_queue))
+ list_add_tail(&tic->t_queue, &log->l_reserveq);
trace_xfs_log_grant_sleep2(log, tic);
@@ -2602,8 +2571,9 @@ redo:
trace_xfs_log_grant_wake2(log, tic);
goto redo;
- } else if (tic->t_flags & XLOG_TIC_IN_Q)
- xlog_del_ticketq(&log->l_reserve_headq, tic);
+ }
+
+ list_del_init(&tic->t_queue);
/* we've got enough space */
xlog_grant_add_space(log, need_bytes);
@@ -2626,9 +2596,7 @@ redo:
return 0;
error_return:
- if (tic->t_flags & XLOG_TIC_IN_Q)
- xlog_del_ticketq(&log->l_reserve_headq, tic);
-
+ list_del_init(&tic->t_queue);
trace_xfs_log_grant_error(log, tic);
/*
@@ -2653,7 +2621,6 @@ xlog_regrant_write_log_space(xlog_t *log,
xlog_ticket_t *tic)
{
int free_bytes, need_bytes;
- xlog_ticket_t *ntic;
#ifdef DEBUG
xfs_lsn_t tail_lsn;
#endif
@@ -2683,22 +2650,23 @@ xlog_regrant_write_log_space(xlog_t *log,
* this transaction.
*/
need_bytes = tic->t_unit_res;
- if ((ntic = log->l_write_headq)) {
+ if (!list_empty(&log->l_writeq)) {
+ struct xlog_ticket *ntic;
free_bytes = xlog_space_left(log, log->l_grant_write_cycle,
log->l_grant_write_bytes);
- do {
+ list_for_each_entry(ntic, &log->l_writeq, t_queue) {
ASSERT(ntic->t_flags & XLOG_TIC_PERM_RESERV);
if (free_bytes < ntic->t_unit_res)
break;
free_bytes -= ntic->t_unit_res;
sv_signal(&ntic->t_wait);
- ntic = ntic->t_next;
- } while (ntic != log->l_write_headq);
+ }
- if (ntic != log->l_write_headq) {
- if ((tic->t_flags & XLOG_TIC_IN_Q) == 0)
- xlog_ins_ticketq(&log->l_write_headq, tic);
+ if (ntic != list_first_entry(&log->l_writeq,
+ struct xlog_ticket, t_queue)) {
+ if (list_empty(&tic->t_queue))
+ list_add_tail(&tic->t_queue, &log->l_writeq);
trace_xfs_log_regrant_write_sleep1(log, tic);
@@ -2727,8 +2695,8 @@ redo:
free_bytes = xlog_space_left(log, log->l_grant_write_cycle,
log->l_grant_write_bytes);
if (free_bytes < need_bytes) {
- if ((tic->t_flags & XLOG_TIC_IN_Q) == 0)
- xlog_ins_ticketq(&log->l_write_headq, tic);
+ if (list_empty(&tic->t_queue))
+ list_add_tail(&tic->t_queue, &log->l_writeq);
spin_unlock(&log->l_grant_lock);
xlog_grant_push_ail(log->l_mp, need_bytes);
spin_lock(&log->l_grant_lock);
@@ -2745,8 +2713,9 @@ redo:
trace_xfs_log_regrant_write_wake2(log, tic);
goto redo;
- } else if (tic->t_flags & XLOG_TIC_IN_Q)
- xlog_del_ticketq(&log->l_write_headq, tic);
+ }
+
+ list_del_init(&tic->t_queue);
/* we've got enough space */
xlog_grant_add_space_write(log, need_bytes);
@@ -2766,9 +2735,7 @@ redo:
error_return:
- if (tic->t_flags & XLOG_TIC_IN_Q)
- xlog_del_ticketq(&log->l_reserve_headq, tic);
-
+ list_del_init(&tic->t_queue);
trace_xfs_log_regrant_write_error(log, tic);
/*
@@ -3435,6 +3402,7 @@ xlog_ticket_alloc(
}
atomic_set(&tic->t_ref, 1);
+ INIT_LIST_HEAD(&tic->t_queue);
tic->t_unit_res = unit_bytes;
tic->t_curr_res = unit_bytes;
tic->t_cnt = cnt;
@@ -3742,26 +3710,17 @@ xfs_log_force_umount(
spin_unlock(&log->l_icloglock);
/*
- * We don't want anybody waiting for log reservations
- * after this. That means we have to wake up everybody
- * queued up on reserve_headq as well as write_headq.
- * In addition, we make sure in xlog_{re}grant_log_space
- * that we don't enqueue anything once the SHUTDOWN flag
- * is set, and this action is protected by the GRANTLOCK.
+ * We don't want anybody waiting for log reservations after this. That
+ * means we have to wake up everybody queued up on reserveq as well as
+ * writeq. In addition, we make sure in xlog_{re}grant_log_space that
+ * we don't enqueue anything once the SHUTDOWN flag is set, and this
+ * action is protected by the GRANTLOCK.
*/
- if ((tic = log->l_reserve_headq)) {
- do {
- sv_signal(&tic->t_wait);
- tic = tic->t_next;
- } while (tic != log->l_reserve_headq);
- }
+ list_for_each_entry(tic, &log->l_reserveq, t_queue)
+ sv_signal(&tic->t_wait);
- if ((tic = log->l_write_headq)) {
- do {
- sv_signal(&tic->t_wait);
- tic = tic->t_next;
- } while (tic != log->l_write_headq);
- }
+ list_for_each_entry(tic, &log->l_writeq, t_queue)
+ sv_signal(&tic->t_wait);
spin_unlock(&log->l_grant_lock);
if (!(log->l_iclog->ic_state & XLOG_STATE_IOERROR)) {
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index edcdfe0..d45fe2d 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -133,12 +133,10 @@ static inline uint xlog_get_client_id(__be32 i)
*/
#define XLOG_TIC_INITED 0x1 /* has been initialized */
#define XLOG_TIC_PERM_RESERV 0x2 /* permanent reservation */
-#define XLOG_TIC_IN_Q 0x4
#define XLOG_TIC_FLAGS \
{ XLOG_TIC_INITED, "XLOG_TIC_INITED" }, \
- { XLOG_TIC_PERM_RESERV, "XLOG_TIC_PERM_RESERV" }, \
- { XLOG_TIC_IN_Q, "XLOG_TIC_IN_Q" }
+ { XLOG_TIC_PERM_RESERV, "XLOG_TIC_PERM_RESERV" }
#endif /* __KERNEL__ */
@@ -245,8 +243,7 @@ typedef struct xlog_res {
typedef struct xlog_ticket {
sv_t t_wait; /* ticket wait queue : 20 */
- struct xlog_ticket *t_next; /* :4|8 */
- struct xlog_ticket *t_prev; /* :4|8 */
+ struct list_head t_queue; /* reserve/write queue */
xlog_tid_t t_tid; /* transaction identifier : 4 */
atomic_t t_ref; /* ticket reference count : 4 */
int t_curr_res; /* current reservation in bytes : 4 */
@@ -520,8 +517,8 @@ typedef struct log {
/* The following block of fields are changed while holding grant_lock */
spinlock_t l_grant_lock ____cacheline_aligned_in_smp;
- xlog_ticket_t *l_reserve_headq;
- xlog_ticket_t *l_write_headq;
+ struct list_head l_reserveq;
+ struct list_head l_writeq;
int l_grant_reserve_cycle;
int l_grant_reserve_bytes;
int l_grant_write_cycle;
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 02/12] xfs: fact out common grant head/log tail verification code
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
2010-12-13 4:44 ` [PATCH 01/12] xfs: convert log grant ticket queues to list heads Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:51 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 03/12] xfs: rework log grant space calculations Dave Chinner
` (9 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
Factor repeated debug code out of grant head manipulation functions into a
separate function. This removes ifdef DEBUG spagetti from the code and makes
the code easier to follow.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log.c | 51 ++++++++++++++++++++++-----------------------------
1 files changed, 22 insertions(+), 29 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 1b82735..99c6285 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -82,6 +82,7 @@ STATIC void xlog_ungrant_log_space(xlog_t *log,
#if defined(DEBUG)
STATIC void xlog_verify_dest_ptr(xlog_t *log, char *ptr);
STATIC void xlog_verify_grant_head(xlog_t *log, int equals);
+STATIC void xlog_verify_grant_tail(struct log *log);
STATIC void xlog_verify_iclog(xlog_t *log, xlog_in_core_t *iclog,
int count, boolean_t syncing);
STATIC void xlog_verify_tail_lsn(xlog_t *log, xlog_in_core_t *iclog,
@@ -89,6 +90,7 @@ STATIC void xlog_verify_tail_lsn(xlog_t *log, xlog_in_core_t *iclog,
#else
#define xlog_verify_dest_ptr(a,b)
#define xlog_verify_grant_head(a,b)
+#define xlog_verify_grant_tail(a)
#define xlog_verify_iclog(a,b,c,d)
#define xlog_verify_tail_lsn(a,b,c)
#endif
@@ -2503,10 +2505,6 @@ xlog_grant_log_space(xlog_t *log,
{
int free_bytes;
int need_bytes;
-#ifdef DEBUG
- xfs_lsn_t tail_lsn;
-#endif
-
#ifdef DEBUG
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
@@ -2577,21 +2575,9 @@ redo:
/* we've got enough space */
xlog_grant_add_space(log, need_bytes);
-#ifdef DEBUG
- tail_lsn = log->l_tail_lsn;
- /*
- * Check to make sure the grant write head didn't just over lap the
- * tail. If the cycles are the same, we can't be overlapping.
- * Otherwise, make sure that the cycles differ by exactly one and
- * check the byte count.
- */
- if (CYCLE_LSN(tail_lsn) != log->l_grant_write_cycle) {
- ASSERT(log->l_grant_write_cycle-1 == CYCLE_LSN(tail_lsn));
- ASSERT(log->l_grant_write_bytes <= BBTOB(BLOCK_LSN(tail_lsn)));
- }
-#endif
trace_xfs_log_grant_exit(log, tic);
xlog_verify_grant_head(log, 1);
+ xlog_verify_grant_tail(log);
spin_unlock(&log->l_grant_lock);
return 0;
@@ -2621,9 +2607,6 @@ xlog_regrant_write_log_space(xlog_t *log,
xlog_ticket_t *tic)
{
int free_bytes, need_bytes;
-#ifdef DEBUG
- xfs_lsn_t tail_lsn;
-#endif
tic->t_curr_res = tic->t_unit_res;
xlog_tic_reset_res(tic);
@@ -2719,17 +2702,9 @@ redo:
/* we've got enough space */
xlog_grant_add_space_write(log, need_bytes);
-#ifdef DEBUG
- tail_lsn = log->l_tail_lsn;
- if (CYCLE_LSN(tail_lsn) != log->l_grant_write_cycle) {
- ASSERT(log->l_grant_write_cycle-1 == CYCLE_LSN(tail_lsn));
- ASSERT(log->l_grant_write_bytes <= BBTOB(BLOCK_LSN(tail_lsn)));
- }
-#endif
-
trace_xfs_log_regrant_write_exit(log, tic);
-
xlog_verify_grant_head(log, 1);
+ xlog_verify_grant_tail(log);
spin_unlock(&log->l_grant_lock);
return 0;
@@ -3465,6 +3440,24 @@ xlog_verify_grant_head(xlog_t *log, int equals)
}
} /* xlog_verify_grant_head */
+STATIC void
+xlog_verify_grant_tail(
+ struct log *log)
+{
+ xfs_lsn_t tail_lsn = log->l_tail_lsn;
+
+ /*
+ * Check to make sure the grant write head didn't just over lap the
+ * tail. If the cycles are the same, we can't be overlapping.
+ * Otherwise, make sure that the cycles differ by exactly one and
+ * check the byte count.
+ */
+ if (CYCLE_LSN(tail_lsn) != log->l_grant_write_cycle) {
+ ASSERT(log->l_grant_write_cycle - 1 == CYCLE_LSN(tail_lsn));
+ ASSERT(log->l_grant_write_bytes <= BBTOB(BLOCK_LSN(tail_lsn)));
+ }
+}
+
/* check if it will fit */
STATIC void
xlog_verify_tail_lsn(xlog_t *log,
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 03/12] xfs: rework log grant space calculations
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
2010-12-13 4:44 ` [PATCH 01/12] xfs: convert log grant ticket queues to list heads Dave Chinner
2010-12-13 4:44 ` [PATCH 02/12] xfs: fact out common grant head/log tail verification code Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:37 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 04/12] xfs: combine grant heads into a single 64 bit integer Dave Chinner
` (8 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
The log grant space calculations are repeated for both write and
reserve grant heads. To make it simpler to convert the calculations
toa different algorithm, factor them so both the gratn heads use the
same calculation functions. Once this is done we can drop the
wrappers that are used in only a couple of place to update both
grant heads at once as they don't provide any particular value.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log.c | 95 +++++++++++++++++++++++++++--------------------------
1 files changed, 48 insertions(+), 47 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 99c6285..9a4b9ed 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -98,53 +98,34 @@ STATIC void xlog_verify_tail_lsn(xlog_t *log, xlog_in_core_t *iclog,
STATIC int xlog_iclogs_empty(xlog_t *log);
static void
-xlog_grant_sub_space(struct log *log, int bytes)
-{
- log->l_grant_write_bytes -= bytes;
- if (log->l_grant_write_bytes < 0) {
- log->l_grant_write_bytes += log->l_logsize;
- log->l_grant_write_cycle--;
- }
-
- log->l_grant_reserve_bytes -= bytes;
- if ((log)->l_grant_reserve_bytes < 0) {
- log->l_grant_reserve_bytes += log->l_logsize;
- log->l_grant_reserve_cycle--;
- }
-
-}
-
-static void
-xlog_grant_add_space_write(struct log *log, int bytes)
+xlog_grant_sub_space(
+ struct log *log,
+ int *cycle,
+ int *space,
+ int bytes)
{
- int tmp = log->l_logsize - log->l_grant_write_bytes;
- if (tmp > bytes)
- log->l_grant_write_bytes += bytes;
- else {
- log->l_grant_write_cycle++;
- log->l_grant_write_bytes = bytes - tmp;
+ *space -= bytes;
+ if (*space < 0) {
+ *space += log->l_logsize;
+ (*cycle)--;
}
}
static void
-xlog_grant_add_space_reserve(struct log *log, int bytes)
+xlog_grant_add_space(
+ struct log *log,
+ int *cycle,
+ int *space,
+ int bytes)
{
- int tmp = log->l_logsize - log->l_grant_reserve_bytes;
+ int tmp = log->l_logsize - *space;
if (tmp > bytes)
- log->l_grant_reserve_bytes += bytes;
+ *space += bytes;
else {
- log->l_grant_reserve_cycle++;
- log->l_grant_reserve_bytes = bytes - tmp;
+ *space = bytes - tmp;
+ (*cycle)++;
}
}
-
-static inline void
-xlog_grant_add_space(struct log *log, int bytes)
-{
- xlog_grant_add_space_write(log, bytes);
- xlog_grant_add_space_reserve(log, bytes);
-}
-
static void
xlog_tic_reset_res(xlog_ticket_t *tic)
{
@@ -1344,7 +1325,10 @@ xlog_sync(xlog_t *log,
/* move grant heads by roundoff in sync */
spin_lock(&log->l_grant_lock);
- xlog_grant_add_space(log, roundoff);
+ xlog_grant_add_space(log, &log->l_grant_reserve_cycle,
+ &log->l_grant_reserve_bytes, roundoff);
+ xlog_grant_add_space(log, &log->l_grant_write_cycle,
+ &log->l_grant_write_bytes, roundoff);
spin_unlock(&log->l_grant_lock);
/* put cycle number in every block */
@@ -2574,7 +2558,10 @@ redo:
list_del_init(&tic->t_queue);
/* we've got enough space */
- xlog_grant_add_space(log, need_bytes);
+ xlog_grant_add_space(log, &log->l_grant_reserve_cycle,
+ &log->l_grant_reserve_bytes, need_bytes);
+ xlog_grant_add_space(log, &log->l_grant_write_cycle,
+ &log->l_grant_write_bytes, need_bytes);
trace_xfs_log_grant_exit(log, tic);
xlog_verify_grant_head(log, 1);
xlog_verify_grant_tail(log);
@@ -2701,7 +2688,8 @@ redo:
list_del_init(&tic->t_queue);
/* we've got enough space */
- xlog_grant_add_space_write(log, need_bytes);
+ xlog_grant_add_space(log, &log->l_grant_write_cycle,
+ &log->l_grant_write_bytes, need_bytes);
trace_xfs_log_regrant_write_exit(log, tic);
xlog_verify_grant_head(log, 1);
xlog_verify_grant_tail(log);
@@ -2742,7 +2730,12 @@ xlog_regrant_reserve_log_space(xlog_t *log,
ticket->t_cnt--;
spin_lock(&log->l_grant_lock);
- xlog_grant_sub_space(log, ticket->t_curr_res);
+ xlog_grant_sub_space(log, &log->l_grant_reserve_cycle,
+ &log->l_grant_reserve_bytes,
+ ticket->t_curr_res);
+ xlog_grant_sub_space(log, &log->l_grant_write_cycle,
+ &log->l_grant_write_bytes,
+ ticket->t_curr_res);
ticket->t_curr_res = ticket->t_unit_res;
xlog_tic_reset_res(ticket);
@@ -2756,7 +2749,9 @@ xlog_regrant_reserve_log_space(xlog_t *log,
return;
}
- xlog_grant_add_space_reserve(log, ticket->t_unit_res);
+ xlog_grant_add_space(log, &log->l_grant_reserve_cycle,
+ &log->l_grant_reserve_bytes,
+ ticket->t_unit_res);
trace_xfs_log_regrant_reserve_exit(log, ticket);
@@ -2785,24 +2780,30 @@ STATIC void
xlog_ungrant_log_space(xlog_t *log,
xlog_ticket_t *ticket)
{
+ int bytes;
+
if (ticket->t_cnt > 0)
ticket->t_cnt--;
spin_lock(&log->l_grant_lock);
trace_xfs_log_ungrant_enter(log, ticket);
-
- xlog_grant_sub_space(log, ticket->t_curr_res);
-
trace_xfs_log_ungrant_sub(log, ticket);
- /* If this is a permanent reservation ticket, we may be able to free
+ /*
+ * If this is a permanent reservation ticket, we may be able to free
* up more space based on the remaining count.
*/
+ bytes = ticket->t_curr_res;
if (ticket->t_cnt > 0) {
ASSERT(ticket->t_flags & XLOG_TIC_PERM_RESERV);
- xlog_grant_sub_space(log, ticket->t_unit_res*ticket->t_cnt);
+ bytes += ticket->t_unit_res*ticket->t_cnt;
}
+ xlog_grant_sub_space(log, &log->l_grant_reserve_cycle,
+ &log->l_grant_reserve_bytes, bytes);
+ xlog_grant_sub_space(log, &log->l_grant_write_cycle,
+ &log->l_grant_write_bytes, bytes);
+
trace_xfs_log_ungrant_exit(log, ticket);
xlog_verify_grant_head(log, 1);
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 04/12] xfs: combine grant heads into a single 64 bit integer
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (2 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 03/12] xfs: rework log grant space calculations Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:40 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 05/12] xfs: use wait queues directly for the log wait queues Dave Chinner
` (7 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
Prepare for switching the grant heads to atomic variables by
combining the two 32 bit values that make up the grant head into a
single 64 bit variable. Provide wrapper functions to combine and
split the grant heads appropriately for calculations and use them as
necessary.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/linux-2.6/xfs_trace.h | 10 ++-
fs/xfs/xfs_log.c | 166 ++++++++++++++++++++++--------------------
fs/xfs/xfs_log_priv.h | 26 ++++++-
fs/xfs/xfs_log_recover.c | 8 +-
4 files changed, 119 insertions(+), 91 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index f400668..82dfcd7 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -786,10 +786,12 @@ DECLARE_EVENT_CLASS(xfs_loggrant_class,
__entry->flags = tic->t_flags;
__entry->reserveq = log->l_reserveq.next;
__entry->writeq = log->l_writeq.next;
- __entry->grant_reserve_cycle = log->l_grant_reserve_cycle;
- __entry->grant_reserve_bytes = log->l_grant_reserve_bytes;
- __entry->grant_write_cycle = log->l_grant_write_cycle;
- __entry->grant_write_bytes = log->l_grant_write_bytes;
+ xlog_crack_grant_head(&log->l_grant_reserve_head,
+ &__entry->grant_reserve_cycle,
+ &__entry->grant_reserve_bytes);
+ xlog_crack_grant_head(&log->l_grant_write_head,
+ &__entry->grant_write_cycle,
+ &__entry->grant_write_bytes);
__entry->curr_cycle = log->l_curr_cycle;
__entry->curr_block = log->l_curr_block;
__entry->tail_lsn = log->l_tail_lsn;
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 9a4b9ed..3299faa 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -47,7 +47,7 @@ STATIC xlog_t * xlog_alloc_log(xfs_mount_t *mp,
xfs_buftarg_t *log_target,
xfs_daddr_t blk_offset,
int num_bblks);
-STATIC int xlog_space_left(xlog_t *log, int cycle, int bytes);
+STATIC int xlog_space_left(struct log *log, int64_t *head);
STATIC int xlog_sync(xlog_t *log, xlog_in_core_t *iclog);
STATIC void xlog_dealloc_log(xlog_t *log);
@@ -100,32 +100,44 @@ STATIC int xlog_iclogs_empty(xlog_t *log);
static void
xlog_grant_sub_space(
struct log *log,
- int *cycle,
- int *space,
+ int64_t *head,
int bytes)
{
- *space -= bytes;
- if (*space < 0) {
- *space += log->l_logsize;
- (*cycle)--;
+ int cycle, space;
+
+ xlog_crack_grant_head(head, &cycle, &space);
+
+ space -= bytes;
+ if (space < 0) {
+ space += log->l_logsize;
+ cycle--;
}
+
+ xlog_assign_grant_head(head, cycle, space);
}
static void
xlog_grant_add_space(
struct log *log,
- int *cycle,
- int *space,
+ int64_t *head,
int bytes)
{
- int tmp = log->l_logsize - *space;
+ int tmp;
+ int cycle, space;
+
+ xlog_crack_grant_head(head, &cycle, &space);
+
+ tmp = log->l_logsize - space;
if (tmp > bytes)
- *space += bytes;
+ space += bytes;
else {
- *space = bytes - tmp;
- (*cycle)++;
+ space = bytes - tmp;
+ cycle++;
}
+
+ xlog_assign_grant_head(head, cycle, space);
}
+
static void
xlog_tic_reset_res(xlog_ticket_t *tic)
{
@@ -654,7 +666,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
{
xlog_ticket_t *tic;
xlog_t *log = mp->m_log;
- int need_bytes, free_bytes, cycle, bytes;
+ int need_bytes, free_bytes;
if (XLOG_FORCED_SHUTDOWN(log))
return;
@@ -680,9 +692,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
panic("Recovery problem");
#endif
- cycle = log->l_grant_write_cycle;
- bytes = log->l_grant_write_bytes;
- free_bytes = xlog_space_left(log, cycle, bytes);
+ free_bytes = xlog_space_left(log, &log->l_grant_write_head);
list_for_each_entry(tic, &log->l_writeq, t_queue) {
ASSERT(tic->t_flags & XLOG_TIC_PERM_RESERV);
@@ -699,9 +709,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
panic("Recovery problem");
#endif
- cycle = log->l_grant_reserve_cycle;
- bytes = log->l_grant_reserve_bytes;
- free_bytes = xlog_space_left(log, cycle, bytes);
+ free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
list_for_each_entry(tic, &log->l_reserveq, t_queue) {
if (tic->t_flags & XLOG_TIC_PERM_RESERV)
need_bytes = tic->t_unit_res*tic->t_cnt;
@@ -814,21 +822,26 @@ xlog_assign_tail_lsn(xfs_mount_t *mp)
* result is that we return the size of the log as the amount of space left.
*/
STATIC int
-xlog_space_left(xlog_t *log, int cycle, int bytes)
+xlog_space_left(
+ struct log *log,
+ int64_t *head)
{
- int free_bytes;
- int tail_bytes;
- int tail_cycle;
+ int free_bytes;
+ int tail_bytes;
+ int tail_cycle;
+ int head_cycle;
+ int head_bytes;
+ xlog_crack_grant_head(head, &head_cycle, &head_bytes);
tail_bytes = BBTOB(BLOCK_LSN(log->l_tail_lsn));
tail_cycle = CYCLE_LSN(log->l_tail_lsn);
- if ((tail_cycle == cycle) && (bytes >= tail_bytes)) {
- free_bytes = log->l_logsize - (bytes - tail_bytes);
- } else if ((tail_cycle + 1) < cycle) {
+ if ((tail_cycle == head_cycle) && (head_bytes >= tail_bytes)) {
+ free_bytes = log->l_logsize - (head_bytes - tail_bytes);
+ } else if ((tail_cycle + 1) < head_cycle) {
return 0;
- } else if (tail_cycle < cycle) {
- ASSERT(tail_cycle == (cycle - 1));
- free_bytes = tail_bytes - bytes;
+ } else if (tail_cycle < head_cycle) {
+ ASSERT(tail_cycle == (head_cycle - 1));
+ free_bytes = tail_bytes - head_bytes;
} else {
/*
* The reservation head is behind the tail.
@@ -839,12 +852,12 @@ xlog_space_left(xlog_t *log, int cycle, int bytes)
"xlog_space_left: head behind tail\n"
" tail_cycle = %d, tail_bytes = %d\n"
" GH cycle = %d, GH bytes = %d",
- tail_cycle, tail_bytes, cycle, bytes);
+ tail_cycle, tail_bytes, head_cycle, head_bytes);
ASSERT(0);
free_bytes = log->l_logsize;
}
return free_bytes;
-} /* xlog_space_left */
+}
/*
@@ -1001,8 +1014,8 @@ xlog_alloc_log(xfs_mount_t *mp,
/* log->l_tail_lsn = 0x100000000LL; cycle = 1; current block = 0 */
log->l_last_sync_lsn = log->l_tail_lsn;
log->l_curr_cycle = 1; /* 0 is bad since this is initial value */
- log->l_grant_reserve_cycle = 1;
- log->l_grant_write_cycle = 1;
+ xlog_assign_grant_head(&log->l_grant_reserve_head, 1, 0);
+ xlog_assign_grant_head(&log->l_grant_write_head, 1, 0);
INIT_LIST_HEAD(&log->l_reserveq);
INIT_LIST_HEAD(&log->l_writeq);
@@ -1190,9 +1203,7 @@ xlog_grant_push_ail(xfs_mount_t *mp,
ASSERT(BTOBB(need_bytes) < log->l_logBBsize);
spin_lock(&log->l_grant_lock);
- free_bytes = xlog_space_left(log,
- log->l_grant_reserve_cycle,
- log->l_grant_reserve_bytes);
+ free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
tail_lsn = log->l_tail_lsn;
free_blocks = BTOBBT(free_bytes);
@@ -1325,10 +1336,8 @@ xlog_sync(xlog_t *log,
/* move grant heads by roundoff in sync */
spin_lock(&log->l_grant_lock);
- xlog_grant_add_space(log, &log->l_grant_reserve_cycle,
- &log->l_grant_reserve_bytes, roundoff);
- xlog_grant_add_space(log, &log->l_grant_write_cycle,
- &log->l_grant_write_bytes, roundoff);
+ xlog_grant_add_space(log, &log->l_grant_reserve_head, roundoff);
+ xlog_grant_add_space(log, &log->l_grant_write_head, roundoff);
spin_unlock(&log->l_grant_lock);
/* put cycle number in every block */
@@ -2531,8 +2540,7 @@ redo:
if (XLOG_FORCED_SHUTDOWN(log))
goto error_return;
- free_bytes = xlog_space_left(log, log->l_grant_reserve_cycle,
- log->l_grant_reserve_bytes);
+ free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
if (free_bytes < need_bytes) {
if (list_empty(&tic->t_queue))
list_add_tail(&tic->t_queue, &log->l_reserveq);
@@ -2558,10 +2566,8 @@ redo:
list_del_init(&tic->t_queue);
/* we've got enough space */
- xlog_grant_add_space(log, &log->l_grant_reserve_cycle,
- &log->l_grant_reserve_bytes, need_bytes);
- xlog_grant_add_space(log, &log->l_grant_write_cycle,
- &log->l_grant_write_bytes, need_bytes);
+ xlog_grant_add_space(log, &log->l_grant_reserve_head, need_bytes);
+ xlog_grant_add_space(log, &log->l_grant_write_head, need_bytes);
trace_xfs_log_grant_exit(log, tic);
xlog_verify_grant_head(log, 1);
xlog_verify_grant_tail(log);
@@ -2622,8 +2628,7 @@ xlog_regrant_write_log_space(xlog_t *log,
need_bytes = tic->t_unit_res;
if (!list_empty(&log->l_writeq)) {
struct xlog_ticket *ntic;
- free_bytes = xlog_space_left(log, log->l_grant_write_cycle,
- log->l_grant_write_bytes);
+ free_bytes = xlog_space_left(log, &log->l_grant_write_head);
list_for_each_entry(ntic, &log->l_writeq, t_queue) {
ASSERT(ntic->t_flags & XLOG_TIC_PERM_RESERV);
@@ -2662,8 +2667,7 @@ redo:
if (XLOG_FORCED_SHUTDOWN(log))
goto error_return;
- free_bytes = xlog_space_left(log, log->l_grant_write_cycle,
- log->l_grant_write_bytes);
+ free_bytes = xlog_space_left(log, &log->l_grant_write_head);
if (free_bytes < need_bytes) {
if (list_empty(&tic->t_queue))
list_add_tail(&tic->t_queue, &log->l_writeq);
@@ -2688,8 +2692,7 @@ redo:
list_del_init(&tic->t_queue);
/* we've got enough space */
- xlog_grant_add_space(log, &log->l_grant_write_cycle,
- &log->l_grant_write_bytes, need_bytes);
+ xlog_grant_add_space(log, &log->l_grant_write_head, need_bytes);
trace_xfs_log_regrant_write_exit(log, tic);
xlog_verify_grant_head(log, 1);
xlog_verify_grant_tail(log);
@@ -2730,12 +2733,10 @@ xlog_regrant_reserve_log_space(xlog_t *log,
ticket->t_cnt--;
spin_lock(&log->l_grant_lock);
- xlog_grant_sub_space(log, &log->l_grant_reserve_cycle,
- &log->l_grant_reserve_bytes,
- ticket->t_curr_res);
- xlog_grant_sub_space(log, &log->l_grant_write_cycle,
- &log->l_grant_write_bytes,
- ticket->t_curr_res);
+ xlog_grant_sub_space(log, &log->l_grant_reserve_head,
+ ticket->t_curr_res);
+ xlog_grant_sub_space(log, &log->l_grant_write_head,
+ ticket->t_curr_res);
ticket->t_curr_res = ticket->t_unit_res;
xlog_tic_reset_res(ticket);
@@ -2749,9 +2750,8 @@ xlog_regrant_reserve_log_space(xlog_t *log,
return;
}
- xlog_grant_add_space(log, &log->l_grant_reserve_cycle,
- &log->l_grant_reserve_bytes,
- ticket->t_unit_res);
+ xlog_grant_add_space(log, &log->l_grant_reserve_head,
+ ticket->t_unit_res);
trace_xfs_log_regrant_reserve_exit(log, ticket);
@@ -2799,10 +2799,8 @@ xlog_ungrant_log_space(xlog_t *log,
bytes += ticket->t_unit_res*ticket->t_cnt;
}
- xlog_grant_sub_space(log, &log->l_grant_reserve_cycle,
- &log->l_grant_reserve_bytes, bytes);
- xlog_grant_sub_space(log, &log->l_grant_write_cycle,
- &log->l_grant_write_bytes, bytes);
+ xlog_grant_sub_space(log, &log->l_grant_reserve_head, bytes);
+ xlog_grant_sub_space(log, &log->l_grant_write_head, bytes);
trace_xfs_log_ungrant_exit(log, ticket);
@@ -3430,22 +3428,31 @@ xlog_verify_dest_ptr(
STATIC void
xlog_verify_grant_head(xlog_t *log, int equals)
{
- if (log->l_grant_reserve_cycle == log->l_grant_write_cycle) {
- if (equals)
- ASSERT(log->l_grant_reserve_bytes >= log->l_grant_write_bytes);
- else
- ASSERT(log->l_grant_reserve_bytes > log->l_grant_write_bytes);
- } else {
- ASSERT(log->l_grant_reserve_cycle-1 == log->l_grant_write_cycle);
- ASSERT(log->l_grant_write_bytes >= log->l_grant_reserve_bytes);
- }
-} /* xlog_verify_grant_head */
+ int reserve_cycle, reserve_space;
+ int write_cycle, write_space;
+
+ xlog_crack_grant_head(&log->l_grant_reserve_head,
+ &reserve_cycle, &reserve_space);
+ xlog_crack_grant_head(&log->l_grant_write_head,
+ &write_cycle, &write_space);
+
+ if (reserve_cycle == write_cycle) {
+ if (equals)
+ ASSERT(reserve_space >= write_space);
+ else
+ ASSERT(reserve_space > write_space);
+ } else {
+ ASSERT(reserve_cycle - 1 == write_cycle);
+ ASSERT(write_space >= reserve_space);
+ }
+}
STATIC void
xlog_verify_grant_tail(
struct log *log)
{
xfs_lsn_t tail_lsn = log->l_tail_lsn;
+ int cycle, space;
/*
* Check to make sure the grant write head didn't just over lap the
@@ -3453,9 +3460,10 @@ xlog_verify_grant_tail(
* Otherwise, make sure that the cycles differ by exactly one and
* check the byte count.
*/
- if (CYCLE_LSN(tail_lsn) != log->l_grant_write_cycle) {
- ASSERT(log->l_grant_write_cycle - 1 == CYCLE_LSN(tail_lsn));
- ASSERT(log->l_grant_write_bytes <= BBTOB(BLOCK_LSN(tail_lsn)));
+ xlog_crack_grant_head(&log->l_grant_write_head, &cycle, &space);
+ if (CYCLE_LSN(tail_lsn) != cycle) {
+ ASSERT(cycle - 1 == CYCLE_LSN(tail_lsn));
+ ASSERT(space <= BBTOB(BLOCK_LSN(tail_lsn)));
}
}
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index d45fe2d..2fd4819 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -519,10 +519,8 @@ typedef struct log {
spinlock_t l_grant_lock ____cacheline_aligned_in_smp;
struct list_head l_reserveq;
struct list_head l_writeq;
- int l_grant_reserve_cycle;
- int l_grant_reserve_bytes;
- int l_grant_write_cycle;
- int l_grant_write_bytes;
+ int64_t l_grant_reserve_head;
+ int64_t l_grant_write_head;
/* The following field are used for debugging; need to hold icloglock */
#ifdef DEBUG
@@ -559,6 +557,26 @@ int xlog_write(struct log *log, struct xfs_log_vec *log_vector,
xlog_in_core_t **commit_iclog, uint flags);
/*
+ * When we crack the grrant head, we sample it first so that the value will not
+ * change while we are cracking it into the component values. This means we
+ * will always get consistent component values to work from.
+ */
+static inline void
+xlog_crack_grant_head(int64_t *head, int *cycle, int *space)
+{
+ int64_t val = *head;
+
+ *cycle = val >> 32;
+ *space = val & 0xffffffff;
+}
+
+static inline void
+xlog_assign_grant_head(int64_t *head, int cycle, int space)
+{
+ *head = ((int64_t)cycle << 32) | space;
+}
+
+/*
* Committed Item List interfaces
*/
int xlog_cil_init(struct log *log);
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index e23f7be..3e91975 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -927,10 +927,10 @@ xlog_find_tail(
log->l_curr_cycle++;
log->l_tail_lsn = be64_to_cpu(rhead->h_tail_lsn);
log->l_last_sync_lsn = be64_to_cpu(rhead->h_lsn);
- log->l_grant_reserve_cycle = log->l_curr_cycle;
- log->l_grant_reserve_bytes = BBTOB(log->l_curr_block);
- log->l_grant_write_cycle = log->l_curr_cycle;
- log->l_grant_write_bytes = BBTOB(log->l_curr_block);
+ xlog_assign_grant_head(&log->l_grant_reserve_head, log->l_curr_cycle,
+ BBTOB(log->l_curr_block));
+ xlog_assign_grant_head(&log->l_grant_write_head, log->l_curr_cycle,
+ BBTOB(log->l_curr_block));
/*
* Look for unmount record. If we find it, then we know there
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 05/12] xfs: use wait queues directly for the log wait queues
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (3 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 04/12] xfs: combine grant heads into a single 64 bit integer Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:40 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 06/12] xfs: make AIL tail pushing independent of the grant lock Dave Chinner
` (6 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
The log grant queues are one of the few places left using sv_t
constructs for waiting. Given we are touching this code, we should
convert them to plain wait queues. While there, convert all the
other sv_t users in the log code as well.
Seeing as this removes the last users of the sv_t type, remove the
header file defining the wrapper and the fragments that still
reference it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/linux-2.6/sv.h | 59 --------------------------------------
fs/xfs/linux-2.6/xfs_linux.h | 1 -
fs/xfs/quota/xfs_dquot.c | 1 -
fs/xfs/xfs_log.c | 64 ++++++++++++++++++-----------------------
fs/xfs/xfs_log_cil.c | 8 ++--
fs/xfs/xfs_log_priv.h | 25 +++++++++++++---
6 files changed, 52 insertions(+), 106 deletions(-)
delete mode 100644 fs/xfs/linux-2.6/sv.h
diff --git a/fs/xfs/linux-2.6/sv.h b/fs/xfs/linux-2.6/sv.h
deleted file mode 100644
index 4dfc7c3..0000000
--- a/fs/xfs/linux-2.6/sv.h
+++ /dev/null
@@ -1,59 +0,0 @@
-/*
- * Copyright (c) 2000-2002,2005 Silicon Graphics, Inc.
- * All Rights Reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it would be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write the Free Software Foundation,
- * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
- */
-#ifndef __XFS_SUPPORT_SV_H__
-#define __XFS_SUPPORT_SV_H__
-
-#include <linux/wait.h>
-#include <linux/sched.h>
-#include <linux/spinlock.h>
-
-/*
- * Synchronisation variables.
- *
- * (Parameters "pri", "svf" and "rts" are not implemented)
- */
-
-typedef struct sv_s {
- wait_queue_head_t waiters;
-} sv_t;
-
-static inline void _sv_wait(sv_t *sv, spinlock_t *lock)
-{
- DECLARE_WAITQUEUE(wait, current);
-
- add_wait_queue_exclusive(&sv->waiters, &wait);
- __set_current_state(TASK_UNINTERRUPTIBLE);
- spin_unlock(lock);
-
- schedule();
-
- remove_wait_queue(&sv->waiters, &wait);
-}
-
-#define sv_init(sv,flag,name) \
- init_waitqueue_head(&(sv)->waiters)
-#define sv_destroy(sv) \
- /*NOTHING*/
-#define sv_wait(sv, pri, lock, s) \
- _sv_wait(sv, lock)
-#define sv_signal(sv) \
- wake_up(&(sv)->waiters)
-#define sv_broadcast(sv) \
- wake_up_all(&(sv)->waiters)
-
-#endif /* __XFS_SUPPORT_SV_H__ */
diff --git a/fs/xfs/linux-2.6/xfs_linux.h b/fs/xfs/linux-2.6/xfs_linux.h
index 9fa4f2a..ccebd86 100644
--- a/fs/xfs/linux-2.6/xfs_linux.h
+++ b/fs/xfs/linux-2.6/xfs_linux.h
@@ -37,7 +37,6 @@
#include <kmem.h>
#include <mrlock.h>
-#include <sv.h>
#include <time.h>
#include <support/debug.h>
diff --git a/fs/xfs/quota/xfs_dquot.c b/fs/xfs/quota/xfs_dquot.c
index faf8e1a..d22aa31 100644
--- a/fs/xfs/quota/xfs_dquot.c
+++ b/fs/xfs/quota/xfs_dquot.c
@@ -149,7 +149,6 @@ xfs_qm_dqdestroy(
ASSERT(list_empty(&dqp->q_freelist));
mutex_destroy(&dqp->q_qlock);
- sv_destroy(&dqp->q_pinwait);
kmem_zone_free(xfs_Gqm->qm_dqzone, dqp);
atomic_dec(&xfs_Gqm->qm_totaldquots);
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 3299faa..3656d77 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -547,8 +547,8 @@ xfs_log_unmount_write(xfs_mount_t *mp)
if (!(iclog->ic_state == XLOG_STATE_ACTIVE ||
iclog->ic_state == XLOG_STATE_DIRTY)) {
if (!XLOG_FORCED_SHUTDOWN(log)) {
- sv_wait(&iclog->ic_force_wait, PMEM,
- &log->l_icloglock, s);
+ xlog_wait(&iclog->ic_force_wait,
+ &log->l_icloglock);
} else {
spin_unlock(&log->l_icloglock);
}
@@ -588,8 +588,8 @@ xfs_log_unmount_write(xfs_mount_t *mp)
|| iclog->ic_state == XLOG_STATE_DIRTY
|| iclog->ic_state == XLOG_STATE_IOERROR) ) {
- sv_wait(&iclog->ic_force_wait, PMEM,
- &log->l_icloglock, s);
+ xlog_wait(&iclog->ic_force_wait,
+ &log->l_icloglock);
} else {
spin_unlock(&log->l_icloglock);
}
@@ -700,7 +700,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
break;
tail_lsn = 0;
free_bytes -= tic->t_unit_res;
- sv_signal(&tic->t_wait);
+ wake_up(&tic->t_wait);
}
}
@@ -719,7 +719,7 @@ xfs_log_move_tail(xfs_mount_t *mp,
break;
tail_lsn = 0;
free_bytes -= need_bytes;
- sv_signal(&tic->t_wait);
+ wake_up(&tic->t_wait);
}
}
spin_unlock(&log->l_grant_lock);
@@ -1060,7 +1060,7 @@ xlog_alloc_log(xfs_mount_t *mp,
spin_lock_init(&log->l_icloglock);
spin_lock_init(&log->l_grant_lock);
- sv_init(&log->l_flush_wait, 0, "flush_wait");
+ init_waitqueue_head(&log->l_flush_wait);
/* log record size must be multiple of BBSIZE; see xlog_rec_header_t */
ASSERT((XFS_BUF_SIZE(bp) & BBMASK) == 0);
@@ -1116,8 +1116,8 @@ xlog_alloc_log(xfs_mount_t *mp,
ASSERT(XFS_BUF_ISBUSY(iclog->ic_bp));
ASSERT(XFS_BUF_VALUSEMA(iclog->ic_bp) <= 0);
- sv_init(&iclog->ic_force_wait, SV_DEFAULT, "iclog-force");
- sv_init(&iclog->ic_write_wait, SV_DEFAULT, "iclog-write");
+ init_waitqueue_head(&iclog->ic_force_wait);
+ init_waitqueue_head(&iclog->ic_write_wait);
iclogp = &iclog->ic_next;
}
@@ -1132,11 +1132,8 @@ xlog_alloc_log(xfs_mount_t *mp,
out_free_iclog:
for (iclog = log->l_iclog; iclog; iclog = prev_iclog) {
prev_iclog = iclog->ic_next;
- if (iclog->ic_bp) {
- sv_destroy(&iclog->ic_force_wait);
- sv_destroy(&iclog->ic_write_wait);
+ if (iclog->ic_bp)
xfs_buf_free(iclog->ic_bp);
- }
kmem_free(iclog);
}
spinlock_destroy(&log->l_icloglock);
@@ -1453,8 +1450,6 @@ xlog_dealloc_log(xlog_t *log)
iclog = log->l_iclog;
for (i=0; i<log->l_iclog_bufs; i++) {
- sv_destroy(&iclog->ic_force_wait);
- sv_destroy(&iclog->ic_write_wait);
xfs_buf_free(iclog->ic_bp);
next_iclog = iclog->ic_next;
kmem_free(iclog);
@@ -2261,7 +2256,7 @@ xlog_state_do_callback(
xlog_state_clean_log(log);
/* wake up threads waiting in xfs_log_force() */
- sv_broadcast(&iclog->ic_force_wait);
+ wake_up_all(&iclog->ic_force_wait);
iclog = iclog->ic_next;
} while (first_iclog != iclog);
@@ -2308,7 +2303,7 @@ xlog_state_do_callback(
spin_unlock(&log->l_icloglock);
if (wake)
- sv_broadcast(&log->l_flush_wait);
+ wake_up_all(&log->l_flush_wait);
}
@@ -2359,7 +2354,7 @@ xlog_state_done_syncing(
* iclog buffer, we wake them all, one will get to do the
* I/O, the others get to wait for the result.
*/
- sv_broadcast(&iclog->ic_write_wait);
+ wake_up_all(&iclog->ic_write_wait);
spin_unlock(&log->l_icloglock);
xlog_state_do_callback(log, aborted, iclog); /* also cleans log */
} /* xlog_state_done_syncing */
@@ -2408,7 +2403,7 @@ restart:
XFS_STATS_INC(xs_log_noiclogs);
/* Wait for log writes to have flushed */
- sv_wait(&log->l_flush_wait, 0, &log->l_icloglock, 0);
+ xlog_wait(&log->l_flush_wait, &log->l_icloglock);
goto restart;
}
@@ -2523,7 +2518,8 @@ xlog_grant_log_space(xlog_t *log,
goto error_return;
XFS_STATS_INC(xs_sleep_logspace);
- sv_wait(&tic->t_wait, PINOD|PLTWAIT, &log->l_grant_lock, s);
+ xlog_wait(&tic->t_wait, &log->l_grant_lock);
+
/*
* If we got an error, and the filesystem is shutting down,
* we'll catch it down below. So just continue...
@@ -2552,7 +2548,7 @@ redo:
spin_lock(&log->l_grant_lock);
XFS_STATS_INC(xs_sleep_logspace);
- sv_wait(&tic->t_wait, PINOD|PLTWAIT, &log->l_grant_lock, s);
+ xlog_wait(&tic->t_wait, &log->l_grant_lock);
spin_lock(&log->l_grant_lock);
if (XLOG_FORCED_SHUTDOWN(log))
@@ -2635,7 +2631,7 @@ xlog_regrant_write_log_space(xlog_t *log,
if (free_bytes < ntic->t_unit_res)
break;
free_bytes -= ntic->t_unit_res;
- sv_signal(&ntic->t_wait);
+ wake_up(&ntic->t_wait);
}
if (ntic != list_first_entry(&log->l_writeq,
@@ -2650,8 +2646,7 @@ xlog_regrant_write_log_space(xlog_t *log,
spin_lock(&log->l_grant_lock);
XFS_STATS_INC(xs_sleep_logspace);
- sv_wait(&tic->t_wait, PINOD|PLTWAIT,
- &log->l_grant_lock, s);
+ xlog_wait(&tic->t_wait, &log->l_grant_lock);
/* If we're shutting down, this tic is already
* off the queue */
@@ -2677,8 +2672,7 @@ redo:
XFS_STATS_INC(xs_sleep_logspace);
trace_xfs_log_regrant_write_sleep2(log, tic);
-
- sv_wait(&tic->t_wait, PINOD|PLTWAIT, &log->l_grant_lock, s);
+ xlog_wait(&tic->t_wait, &log->l_grant_lock);
/* If we're shutting down, this tic is already off the queue */
spin_lock(&log->l_grant_lock);
@@ -3029,7 +3023,7 @@ maybe_sleep:
return XFS_ERROR(EIO);
}
XFS_STATS_INC(xs_log_force_sleep);
- sv_wait(&iclog->ic_force_wait, PINOD, &log->l_icloglock, s);
+ xlog_wait(&iclog->ic_force_wait, &log->l_icloglock);
/*
* No need to grab the log lock here since we're
* only deciding whether or not to return EIO
@@ -3147,8 +3141,8 @@ try_again:
XFS_STATS_INC(xs_log_force_sleep);
- sv_wait(&iclog->ic_prev->ic_write_wait,
- PSWP, &log->l_icloglock, s);
+ xlog_wait(&iclog->ic_prev->ic_write_wait,
+ &log->l_icloglock);
if (log_flushed)
*log_flushed = 1;
already_slept = 1;
@@ -3176,7 +3170,7 @@ try_again:
return XFS_ERROR(EIO);
}
XFS_STATS_INC(xs_log_force_sleep);
- sv_wait(&iclog->ic_force_wait, PSWP, &log->l_icloglock, s);
+ xlog_wait(&iclog->ic_force_wait, &log->l_icloglock);
/*
* No need to grab the log lock here since we're
* only deciding whether or not to return EIO
@@ -3251,10 +3245,8 @@ xfs_log_ticket_put(
xlog_ticket_t *ticket)
{
ASSERT(atomic_read(&ticket->t_ref) > 0);
- if (atomic_dec_and_test(&ticket->t_ref)) {
- sv_destroy(&ticket->t_wait);
+ if (atomic_dec_and_test(&ticket->t_ref))
kmem_zone_free(xfs_log_ticket_zone, ticket);
- }
}
xlog_ticket_t *
@@ -3387,7 +3379,7 @@ xlog_ticket_alloc(
tic->t_trans_type = 0;
if (xflags & XFS_LOG_PERM_RESERV)
tic->t_flags |= XLOG_TIC_PERM_RESERV;
- sv_init(&tic->t_wait, SV_DEFAULT, "logtick");
+ init_waitqueue_head(&tic->t_wait);
xlog_tic_reset_res(tic);
@@ -3719,10 +3711,10 @@ xfs_log_force_umount(
* action is protected by the GRANTLOCK.
*/
list_for_each_entry(tic, &log->l_reserveq, t_queue)
- sv_signal(&tic->t_wait);
+ wake_up(&tic->t_wait);
list_for_each_entry(tic, &log->l_writeq, t_queue)
- sv_signal(&tic->t_wait);
+ wake_up(&tic->t_wait);
spin_unlock(&log->l_grant_lock);
if (!(log->l_iclog->ic_state & XLOG_STATE_IOERROR)) {
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index f36f1a2..9dc8125 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -61,7 +61,7 @@ xlog_cil_init(
INIT_LIST_HEAD(&cil->xc_committing);
spin_lock_init(&cil->xc_cil_lock);
init_rwsem(&cil->xc_ctx_lock);
- sv_init(&cil->xc_commit_wait, SV_DEFAULT, "cilwait");
+ init_waitqueue_head(&cil->xc_commit_wait);
INIT_LIST_HEAD(&ctx->committing);
INIT_LIST_HEAD(&ctx->busy_extents);
@@ -563,7 +563,7 @@ restart:
* It is still being pushed! Wait for the push to
* complete, then start again from the beginning.
*/
- sv_wait(&cil->xc_commit_wait, 0, &cil->xc_cil_lock, 0);
+ xlog_wait(&cil->xc_commit_wait, &cil->xc_cil_lock);
goto restart;
}
}
@@ -587,7 +587,7 @@ restart:
*/
spin_lock(&cil->xc_cil_lock);
ctx->commit_lsn = commit_lsn;
- sv_broadcast(&cil->xc_commit_wait);
+ wake_up_all(&cil->xc_commit_wait);
spin_unlock(&cil->xc_cil_lock);
/* release the hounds! */
@@ -752,7 +752,7 @@ restart:
* It is still being pushed! Wait for the push to
* complete, then start again from the beginning.
*/
- sv_wait(&cil->xc_commit_wait, 0, &cil->xc_cil_lock, 0);
+ xlog_wait(&cil->xc_commit_wait, &cil->xc_cil_lock);
goto restart;
}
if (ctx->sequence != sequence)
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 2fd4819..84c5e79 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -242,7 +242,7 @@ typedef struct xlog_res {
} xlog_res_t;
typedef struct xlog_ticket {
- sv_t t_wait; /* ticket wait queue : 20 */
+ wait_queue_head_t t_wait; /* ticket wait queue */
struct list_head t_queue; /* reserve/write queue */
xlog_tid_t t_tid; /* transaction identifier : 4 */
atomic_t t_ref; /* ticket reference count : 4 */
@@ -350,8 +350,8 @@ typedef union xlog_in_core2 {
* and move everything else out to subsequent cachelines.
*/
typedef struct xlog_in_core {
- sv_t ic_force_wait;
- sv_t ic_write_wait;
+ wait_queue_head_t ic_force_wait;
+ wait_queue_head_t ic_write_wait;
struct xlog_in_core *ic_next;
struct xlog_in_core *ic_prev;
struct xfs_buf *ic_bp;
@@ -418,7 +418,7 @@ struct xfs_cil {
struct xfs_cil_ctx *xc_ctx;
struct rw_semaphore xc_ctx_lock;
struct list_head xc_committing;
- sv_t xc_commit_wait;
+ wait_queue_head_t xc_commit_wait;
xfs_lsn_t xc_current_sequence;
};
@@ -500,7 +500,7 @@ typedef struct log {
int l_logBBsize; /* size of log in BB chunks */
/* The following block of fields are changed while holding icloglock */
- sv_t l_flush_wait ____cacheline_aligned_in_smp;
+ wait_queue_head_t l_flush_wait ____cacheline_aligned_in_smp;
/* waiting for iclog flush */
int l_covered_state;/* state of "covering disk
* log entries" */
@@ -600,6 +600,21 @@ xlog_cil_force(struct log *log)
*/
#define XLOG_UNMOUNT_REC_TYPE (-1U)
+/*
+ * Wrapper function for waiting on a wait queue serialised against wakeups
+ * by a spinlock. This matches the semantics of all the wait queues used in the
+ * log code.
+ */
+static inline void xlog_wait(wait_queue_head_t *wq, spinlock_t *lock)
+{
+ DECLARE_WAITQUEUE(wait, current);
+
+ add_wait_queue_exclusive(wq, &wait);
+ __set_current_state(TASK_UNINTERRUPTIBLE);
+ spin_unlock(lock);
+ schedule();
+ remove_wait_queue(wq, &wait);
+}
#endif /* __KERNEL__ */
#endif /* __XFS_LOG_PRIV_H__ */
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 06/12] xfs: make AIL tail pushing independent of the grant lock
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (4 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 05/12] xfs: use wait queues directly for the log wait queues Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:41 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 07/12] xfs: convert l_last_sync_lsn to an atomic variable Dave Chinner
` (5 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
The xlog_grant_push_ail() currently takes the grant lock internally to sample
the tail lsn, last sync lsn and the reserve grant head. Most of the callers
already hold the grant lock but have to drop it before calling
xlog_grant_push_ail(). This is a left over from when the AIL tail pushing was
done in line and hence xlog_grant_push_ail had to drop the grant lock. AIL push
is now done in another thread and hence we can safely hold the grant lock over
the entire xlog_grant_push_ail call.
Push the grant lock outside of xlog_grant_push_ail() to simplify the locking
and synchronisation needed for tail pushing. This will reduce traffic on the
grant lock by itself, but this is only one step in preparing for the complete
removal of the grant lock.
While there, clean up the formatting of xlog_grant_push_ail() to match the
rest of the XFS code.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log.c | 111 ++++++++++++++++++++++++++---------------------------
1 files changed, 54 insertions(+), 57 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 3656d77..d15bd5b 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -70,7 +70,7 @@ STATIC void xlog_state_want_sync(xlog_t *log, xlog_in_core_t *iclog);
/* local functions to manipulate grant head */
STATIC int xlog_grant_log_space(xlog_t *log,
xlog_ticket_t *xtic);
-STATIC void xlog_grant_push_ail(xfs_mount_t *mp,
+STATIC void xlog_grant_push_ail(struct log *log,
int need_bytes);
STATIC void xlog_regrant_reserve_log_space(xlog_t *log,
xlog_ticket_t *ticket);
@@ -318,7 +318,9 @@ xfs_log_reserve(
trace_xfs_log_reserve(log, internal_ticket);
- xlog_grant_push_ail(mp, internal_ticket->t_unit_res);
+ spin_lock(&log->l_grant_lock);
+ xlog_grant_push_ail(log, internal_ticket->t_unit_res);
+ spin_unlock(&log->l_grant_lock);
retval = xlog_regrant_write_log_space(log, internal_ticket);
} else {
/* may sleep if need to allocate more tickets */
@@ -332,9 +334,11 @@ xfs_log_reserve(
trace_xfs_log_reserve(log, internal_ticket);
- xlog_grant_push_ail(mp,
+ spin_lock(&log->l_grant_lock);
+ xlog_grant_push_ail(log,
(internal_ticket->t_unit_res *
internal_ticket->t_cnt));
+ spin_unlock(&log->l_grant_lock);
retval = xlog_grant_log_space(log, internal_ticket);
}
@@ -1185,59 +1189,58 @@ xlog_commit_record(
* water mark. In this manner, we would be creating a low water mark.
*/
STATIC void
-xlog_grant_push_ail(xfs_mount_t *mp,
- int need_bytes)
+xlog_grant_push_ail(
+ struct log *log,
+ int need_bytes)
{
- xlog_t *log = mp->m_log; /* pointer to the log */
- xfs_lsn_t tail_lsn; /* lsn of the log tail */
- xfs_lsn_t threshold_lsn = 0; /* lsn we'd like to be at */
- int free_blocks; /* free blocks left to write to */
- int free_bytes; /* free bytes left to write to */
- int threshold_block; /* block in lsn we'd like to be at */
- int threshold_cycle; /* lsn cycle we'd like to be at */
- int free_threshold;
-
- ASSERT(BTOBB(need_bytes) < log->l_logBBsize);
-
- spin_lock(&log->l_grant_lock);
- free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
- tail_lsn = log->l_tail_lsn;
- free_blocks = BTOBBT(free_bytes);
-
- /*
- * Set the threshold for the minimum number of free blocks in the
- * log to the maximum of what the caller needs, one quarter of the
- * log, and 256 blocks.
- */
- free_threshold = BTOBB(need_bytes);
- free_threshold = MAX(free_threshold, (log->l_logBBsize >> 2));
- free_threshold = MAX(free_threshold, 256);
- if (free_blocks < free_threshold) {
+ xfs_lsn_t threshold_lsn = 0;
+ xfs_lsn_t tail_lsn;
+ int free_blocks;
+ int free_bytes;
+ int threshold_block;
+ int threshold_cycle;
+ int free_threshold;
+
+ ASSERT(BTOBB(need_bytes) < log->l_logBBsize);
+
+ tail_lsn = log->l_tail_lsn;
+ free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
+ free_blocks = BTOBBT(free_bytes);
+
+ /*
+ * Set the threshold for the minimum number of free blocks in the
+ * log to the maximum of what the caller needs, one quarter of the
+ * log, and 256 blocks.
+ */
+ free_threshold = BTOBB(need_bytes);
+ free_threshold = MAX(free_threshold, (log->l_logBBsize >> 2));
+ free_threshold = MAX(free_threshold, 256);
+ if (free_blocks >= free_threshold)
+ return;
+
threshold_block = BLOCK_LSN(tail_lsn) + free_threshold;
threshold_cycle = CYCLE_LSN(tail_lsn);
if (threshold_block >= log->l_logBBsize) {
- threshold_block -= log->l_logBBsize;
- threshold_cycle += 1;
+ threshold_block -= log->l_logBBsize;
+ threshold_cycle += 1;
}
- threshold_lsn = xlog_assign_lsn(threshold_cycle, threshold_block);
-
- /* Don't pass in an lsn greater than the lsn of the last
+ threshold_lsn = xlog_assign_lsn(threshold_cycle,
+ threshold_block);
+ /*
+ * Don't pass in an lsn greater than the lsn of the last
* log record known to be on disk.
*/
if (XFS_LSN_CMP(threshold_lsn, log->l_last_sync_lsn) > 0)
- threshold_lsn = log->l_last_sync_lsn;
- }
- spin_unlock(&log->l_grant_lock);
-
- /*
- * Get the transaction layer to kick the dirty buffers out to
- * disk asynchronously. No point in trying to do this if
- * the filesystem is shutting down.
- */
- if (threshold_lsn &&
- !XLOG_FORCED_SHUTDOWN(log))
- xfs_trans_ail_push(log->l_ailp, threshold_lsn);
-} /* xlog_grant_push_ail */
+ threshold_lsn = log->l_last_sync_lsn;
+
+ /*
+ * Get the transaction layer to kick the dirty buffers out to
+ * disk asynchronously. No point in trying to do this if
+ * the filesystem is shutting down.
+ */
+ if (!XLOG_FORCED_SHUTDOWN(log))
+ xfs_trans_ail_push(log->l_ailp, threshold_lsn);
+}
/*
* The bdstrat callback function for log bufs. This gives us a central
@@ -2543,9 +2546,7 @@ redo:
trace_xfs_log_grant_sleep2(log, tic);
- spin_unlock(&log->l_grant_lock);
- xlog_grant_push_ail(log->l_mp, need_bytes);
- spin_lock(&log->l_grant_lock);
+ xlog_grant_push_ail(log, need_bytes);
XFS_STATS_INC(xs_sleep_logspace);
xlog_wait(&tic->t_wait, &log->l_grant_lock);
@@ -2641,9 +2642,7 @@ xlog_regrant_write_log_space(xlog_t *log,
trace_xfs_log_regrant_write_sleep1(log, tic);
- spin_unlock(&log->l_grant_lock);
- xlog_grant_push_ail(log->l_mp, need_bytes);
- spin_lock(&log->l_grant_lock);
+ xlog_grant_push_ail(log, need_bytes);
XFS_STATS_INC(xs_sleep_logspace);
xlog_wait(&tic->t_wait, &log->l_grant_lock);
@@ -2666,9 +2665,7 @@ redo:
if (free_bytes < need_bytes) {
if (list_empty(&tic->t_queue))
list_add_tail(&tic->t_queue, &log->l_writeq);
- spin_unlock(&log->l_grant_lock);
- xlog_grant_push_ail(log->l_mp, need_bytes);
- spin_lock(&log->l_grant_lock);
+ xlog_grant_push_ail(log, need_bytes);
XFS_STATS_INC(xs_sleep_logspace);
trace_xfs_log_regrant_write_sleep2(log, tic);
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 07/12] xfs: convert l_last_sync_lsn to an atomic variable
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (5 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 06/12] xfs: make AIL tail pushing independent of the grant lock Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-13 4:44 ` [PATCH 08/12] xfs: convert l_tail_lsn " Dave Chinner
` (4 subsequent siblings)
11 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
log->l_last_sync_lsn is updated in only one critical spot - log
buffer Io completion - and is protected by the grant lock here. This
requires the grant lock to be taken for every log buffer IO
completion. Converting the l_last_sync_lsn variable to an atomic64_t
means that we do not need to take the grant lock in log buffer IO
completion to update it.
This also removes the need for explicitly holding a spinlock to read
the l_last_sync_lsn on 32 bit platforms.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_log.c | 55 +++++++++++++++++++++-------------------------
fs/xfs/xfs_log_priv.h | 9 ++++++-
fs/xfs/xfs_log_recover.c | 6 ++--
3 files changed, 36 insertions(+), 34 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index d15bd5b..6ed6ee9 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -675,12 +675,8 @@ xfs_log_move_tail(xfs_mount_t *mp,
if (XLOG_FORCED_SHUTDOWN(log))
return;
- if (tail_lsn == 0) {
- /* needed since sync_lsn is 64 bits */
- spin_lock(&log->l_icloglock);
- tail_lsn = log->l_last_sync_lsn;
- spin_unlock(&log->l_icloglock);
- }
+ if (tail_lsn == 0)
+ tail_lsn = atomic64_read(&log->l_last_sync_lsn);
spin_lock(&log->l_grant_lock);
@@ -800,11 +796,9 @@ xlog_assign_tail_lsn(xfs_mount_t *mp)
tail_lsn = xfs_trans_ail_tail(mp->m_ail);
spin_lock(&log->l_grant_lock);
- if (tail_lsn != 0) {
- log->l_tail_lsn = tail_lsn;
- } else {
- tail_lsn = log->l_tail_lsn = log->l_last_sync_lsn;
- }
+ if (!tail_lsn)
+ tail_lsn = atomic64_read(&log->l_last_sync_lsn);
+ log->l_tail_lsn = tail_lsn;
spin_unlock(&log->l_grant_lock);
return tail_lsn;
@@ -1014,9 +1008,9 @@ xlog_alloc_log(xfs_mount_t *mp,
log->l_flags |= XLOG_ACTIVE_RECOVERY;
log->l_prev_block = -1;
- log->l_tail_lsn = xlog_assign_lsn(1, 0);
/* log->l_tail_lsn = 0x100000000LL; cycle = 1; current block = 0 */
- log->l_last_sync_lsn = log->l_tail_lsn;
+ log->l_tail_lsn = xlog_assign_lsn(1, 0);
+ atomic64_set(&log->l_last_sync_lsn, xlog_assign_lsn(1, 0));
log->l_curr_cycle = 1; /* 0 is bad since this is initial value */
xlog_assign_grant_head(&log->l_grant_reserve_head, 1, 0);
xlog_assign_grant_head(&log->l_grant_write_head, 1, 0);
@@ -1194,6 +1188,7 @@ xlog_grant_push_ail(
int need_bytes)
{
xfs_lsn_t threshold_lsn = 0;
+ xfs_lsn_t last_sync_lsn;
xfs_lsn_t tail_lsn;
int free_blocks;
int free_bytes;
@@ -1228,10 +1223,12 @@ xlog_grant_push_ail(
threshold_block);
/*
* Don't pass in an lsn greater than the lsn of the last
- * log record known to be on disk.
+ * log record known to be on disk. Use a snapshot of the last sync lsn
+ * so that it doesn't change between the compare and the set.
*/
- if (XFS_LSN_CMP(threshold_lsn, log->l_last_sync_lsn) > 0)
- threshold_lsn = log->l_last_sync_lsn;
+ last_sync_lsn = atomic64_read(&log->l_last_sync_lsn);
+ if (XFS_LSN_CMP(threshold_lsn, last_sync_lsn) > 0)
+ threshold_lsn = last_sync_lsn;
/*
* Get the transaction layer to kick the dirty buffers out to
@@ -2194,7 +2191,7 @@ xlog_state_do_callback(
lowest_lsn = xlog_get_lowest_lsn(log);
if (lowest_lsn &&
XFS_LSN_CMP(lowest_lsn,
- be64_to_cpu(iclog->ic_header.h_lsn)) < 0) {
+ be64_to_cpu(iclog->ic_header.h_lsn)) < 0) {
iclog = iclog->ic_next;
continue; /* Leave this iclog for
* another thread */
@@ -2202,23 +2199,21 @@ xlog_state_do_callback(
iclog->ic_state = XLOG_STATE_CALLBACK;
- spin_unlock(&log->l_icloglock);
- /* l_last_sync_lsn field protected by
- * l_grant_lock. Don't worry about iclog's lsn.
- * No one else can be here except us.
+ /*
+ * update the last_sync_lsn before we drop the
+ * icloglock to ensure we are the only one that
+ * can update it.
*/
- spin_lock(&log->l_grant_lock);
- ASSERT(XFS_LSN_CMP(log->l_last_sync_lsn,
- be64_to_cpu(iclog->ic_header.h_lsn)) <= 0);
- log->l_last_sync_lsn =
- be64_to_cpu(iclog->ic_header.h_lsn);
- spin_unlock(&log->l_grant_lock);
+ ASSERT(XFS_LSN_CMP(atomic64_read(&log->l_last_sync_lsn),
+ be64_to_cpu(iclog->ic_header.h_lsn)) <= 0);
+ atomic64_set(&log->l_last_sync_lsn,
+ be64_to_cpu(iclog->ic_header.h_lsn));
- } else {
- spin_unlock(&log->l_icloglock);
+ } else
ioerrors++;
- }
+
+ spin_unlock(&log->l_icloglock);
/*
* Keep processing entries in the callback list until
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 84c5e79..129068c 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -508,7 +508,6 @@ typedef struct log {
spinlock_t l_icloglock; /* grab to change iclog state */
xfs_lsn_t l_tail_lsn; /* lsn of 1st LR with unflushed
* buffers */
- xfs_lsn_t l_last_sync_lsn;/* lsn of last LR on disk */
int l_curr_cycle; /* Cycle number of log writes */
int l_prev_cycle; /* Cycle number before last
* block increment */
@@ -522,6 +521,14 @@ typedef struct log {
int64_t l_grant_reserve_head;
int64_t l_grant_write_head;
+ /*
+ * l_last_sync_lsn is an atomic so it can be set and read without
+ * needing to hold specific locks. To avoid operations contending with
+ * other hot objects, place it on a separate cacheline.
+ */
+ /* lsn of last LR on disk */
+ atomic64_t l_last_sync_lsn ____cacheline_aligned_in_smp;
+
/* The following field are used for debugging; need to hold icloglock */
#ifdef DEBUG
char *l_iclog_bak[XLOG_MAX_ICLOGS];
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 3e91975..7e5b8ab 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -926,7 +926,7 @@ xlog_find_tail(
if (found == 2)
log->l_curr_cycle++;
log->l_tail_lsn = be64_to_cpu(rhead->h_tail_lsn);
- log->l_last_sync_lsn = be64_to_cpu(rhead->h_lsn);
+ atomic64_set(&log->l_last_sync_lsn, be64_to_cpu(rhead->h_lsn));
xlog_assign_grant_head(&log->l_grant_reserve_head, log->l_curr_cycle,
BBTOB(log->l_curr_block));
xlog_assign_grant_head(&log->l_grant_write_head, log->l_curr_cycle,
@@ -978,9 +978,9 @@ xlog_find_tail(
log->l_tail_lsn =
xlog_assign_lsn(log->l_curr_cycle,
after_umount_blk);
- log->l_last_sync_lsn =
+ atomic64_set(&log->l_last_sync_lsn,
xlog_assign_lsn(log->l_curr_cycle,
- after_umount_blk);
+ after_umount_blk));
*tail_blk = after_umount_blk;
/*
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 08/12] xfs: convert l_tail_lsn to an atomic variable.
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (6 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 07/12] xfs: convert l_last_sync_lsn to an atomic variable Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 12:06 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 09/12] xfs: convert log grant heads to atomic variables Dave Chinner
` (3 subsequent siblings)
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
log->l_tail_lsn is currently protected by the log grant lock. The
lock is only needed for serialising readers against writers, so we
don't really need the lock if we make the l_tail_lsn variable an
atomic. Converting the l_tail_lsn variable to an atomic64_t means we
can start to peel back the grant lock from various operations.
Also, provide functions to safely crack an atomic LSN variable into
it's component pieces and to recombined the components into an
atomic variable. Use them where appropriate.
This also removes the need for explicitly holding a spinlock to read
the l_tail_lsn on 32 bit platforms.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/linux-2.6/xfs_trace.h | 2 +-
fs/xfs/xfs_log.c | 56 ++++++++++++++++++-----------------------
fs/xfs/xfs_log_priv.h | 37 +++++++++++++++++++++++----
fs/xfs/xfs_log_recover.c | 14 ++++------
4 files changed, 63 insertions(+), 46 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index 82dfcd7..2f62d0f 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -794,7 +794,7 @@ DECLARE_EVENT_CLASS(xfs_loggrant_class,
&__entry->grant_write_bytes);
__entry->curr_cycle = log->l_curr_cycle;
__entry->curr_block = log->l_curr_block;
- __entry->tail_lsn = log->l_tail_lsn;
+ __entry->tail_lsn = atomic64_read(&log->l_tail_lsn);
),
TP_printk("dev %d:%d type %s t_ocnt %u t_cnt %u t_curr_res %u "
"t_unit_res %u t_flags %s reserve_headq 0x%p "
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 6ed6ee9..57bd143 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -678,15 +678,11 @@ xfs_log_move_tail(xfs_mount_t *mp,
if (tail_lsn == 0)
tail_lsn = atomic64_read(&log->l_last_sync_lsn);
- spin_lock(&log->l_grant_lock);
-
- /* Also an invalid lsn. 1 implies that we aren't passing in a valid
- * tail_lsn.
- */
- if (tail_lsn != 1) {
- log->l_tail_lsn = tail_lsn;
- }
+ /* tail_lsn == 1 implies that we weren't passed a valid value. */
+ if (tail_lsn != 1)
+ atomic64_set(&log->l_tail_lsn, tail_lsn);
+ spin_lock(&log->l_grant_lock);
if (!list_empty(&log->l_writeq)) {
#ifdef DEBUG
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
@@ -789,21 +785,19 @@ xfs_log_need_covered(xfs_mount_t *mp)
* We may be holding the log iclog lock upon entering this routine.
*/
xfs_lsn_t
-xlog_assign_tail_lsn(xfs_mount_t *mp)
+xlog_assign_tail_lsn(
+ struct xfs_mount *mp)
{
- xfs_lsn_t tail_lsn;
- xlog_t *log = mp->m_log;
+ xfs_lsn_t tail_lsn;
+ struct log *log = mp->m_log;
tail_lsn = xfs_trans_ail_tail(mp->m_ail);
- spin_lock(&log->l_grant_lock);
if (!tail_lsn)
tail_lsn = atomic64_read(&log->l_last_sync_lsn);
- log->l_tail_lsn = tail_lsn;
- spin_unlock(&log->l_grant_lock);
+ atomic64_set(&log->l_tail_lsn, tail_lsn);
return tail_lsn;
-} /* xlog_assign_tail_lsn */
-
+}
/*
* Return the space in the log between the tail and the head. The head
@@ -831,8 +825,8 @@ xlog_space_left(
int head_bytes;
xlog_crack_grant_head(head, &head_cycle, &head_bytes);
- tail_bytes = BBTOB(BLOCK_LSN(log->l_tail_lsn));
- tail_cycle = CYCLE_LSN(log->l_tail_lsn);
+ xlog_crack_atomic_lsn(&log->l_tail_lsn, &tail_cycle, &tail_bytes);
+ tail_bytes = BBTOB(tail_bytes);
if ((tail_cycle == head_cycle) && (head_bytes >= tail_bytes)) {
free_bytes = log->l_logsize - (head_bytes - tail_bytes);
} else if ((tail_cycle + 1) < head_cycle) {
@@ -1009,8 +1003,8 @@ xlog_alloc_log(xfs_mount_t *mp,
log->l_prev_block = -1;
/* log->l_tail_lsn = 0x100000000LL; cycle = 1; current block = 0 */
- log->l_tail_lsn = xlog_assign_lsn(1, 0);
- atomic64_set(&log->l_last_sync_lsn, xlog_assign_lsn(1, 0));
+ xlog_assign_atomic_lsn(&log->l_tail_lsn, 1, 0);
+ xlog_assign_atomic_lsn(&log->l_last_sync_lsn, 1, 0);
log->l_curr_cycle = 1; /* 0 is bad since this is initial value */
xlog_assign_grant_head(&log->l_grant_reserve_head, 1, 0);
xlog_assign_grant_head(&log->l_grant_write_head, 1, 0);
@@ -1189,7 +1183,6 @@ xlog_grant_push_ail(
{
xfs_lsn_t threshold_lsn = 0;
xfs_lsn_t last_sync_lsn;
- xfs_lsn_t tail_lsn;
int free_blocks;
int free_bytes;
int threshold_block;
@@ -1198,7 +1191,6 @@ xlog_grant_push_ail(
ASSERT(BTOBB(need_bytes) < log->l_logBBsize);
- tail_lsn = log->l_tail_lsn;
free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
free_blocks = BTOBBT(free_bytes);
@@ -1213,8 +1205,9 @@ xlog_grant_push_ail(
if (free_blocks >= free_threshold)
return;
- threshold_block = BLOCK_LSN(tail_lsn) + free_threshold;
- threshold_cycle = CYCLE_LSN(tail_lsn);
+ xlog_crack_atomic_lsn(&log->l_tail_lsn, &threshold_cycle,
+ &threshold_block);
+ threshold_block += free_threshold;
if (threshold_block >= log->l_logBBsize) {
threshold_block -= log->l_logBBsize;
threshold_cycle += 1;
@@ -2828,11 +2821,11 @@ xlog_state_release_iclog(
if (iclog->ic_state == XLOG_STATE_WANT_SYNC) {
/* update tail before writing to iclog */
- xlog_assign_tail_lsn(log->l_mp);
+ xfs_lsn_t tail_lsn = xlog_assign_tail_lsn(log->l_mp);
sync++;
iclog->ic_state = XLOG_STATE_SYNCING;
- iclog->ic_header.h_tail_lsn = cpu_to_be64(log->l_tail_lsn);
- xlog_verify_tail_lsn(log, iclog, log->l_tail_lsn);
+ iclog->ic_header.h_tail_lsn = cpu_to_be64(tail_lsn);
+ xlog_verify_tail_lsn(log, iclog, tail_lsn);
/* cycle incremented when incrementing curr_block */
}
spin_unlock(&log->l_icloglock);
@@ -3435,7 +3428,7 @@ STATIC void
xlog_verify_grant_tail(
struct log *log)
{
- xfs_lsn_t tail_lsn = log->l_tail_lsn;
+ int tail_cycle, tail_blocks;
int cycle, space;
/*
@@ -3445,9 +3438,10 @@ xlog_verify_grant_tail(
* check the byte count.
*/
xlog_crack_grant_head(&log->l_grant_write_head, &cycle, &space);
- if (CYCLE_LSN(tail_lsn) != cycle) {
- ASSERT(cycle - 1 == CYCLE_LSN(tail_lsn));
- ASSERT(space <= BBTOB(BLOCK_LSN(tail_lsn)));
+ xlog_crack_atomic_lsn(&log->l_tail_lsn, &tail_cycle, &tail_blocks);
+ if (tail_cycle != cycle) {
+ ASSERT(cycle - 1 == tail_cycle);
+ ASSERT(space <= BBTOB(tail_blocks));
}
}
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 129068c..100a410 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -54,7 +54,6 @@ struct xfs_mount;
BTOBB(XLOG_MAX_ICLOGS << (xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? \
XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT))
-
static inline xfs_lsn_t xlog_assign_lsn(uint cycle, uint block)
{
return ((xfs_lsn_t)cycle << 32) | block;
@@ -506,8 +505,6 @@ typedef struct log {
* log entries" */
xlog_in_core_t *l_iclog; /* head log queue */
spinlock_t l_icloglock; /* grab to change iclog state */
- xfs_lsn_t l_tail_lsn; /* lsn of 1st LR with unflushed
- * buffers */
int l_curr_cycle; /* Cycle number of log writes */
int l_prev_cycle; /* Cycle number before last
* block increment */
@@ -522,12 +519,15 @@ typedef struct log {
int64_t l_grant_write_head;
/*
- * l_last_sync_lsn is an atomic so it can be set and read without
- * needing to hold specific locks. To avoid operations contending with
- * other hot objects, place it on a separate cacheline.
+ * l_last_sync_lsn and l_tail_lsn are atomics so they can be set and
+ * read without needing to hold specific locks. To avoid operations
+ * contending with other hot objects, place each of them on a separate
+ * cacheline.
*/
/* lsn of last LR on disk */
atomic64_t l_last_sync_lsn ____cacheline_aligned_in_smp;
+ /* lsn of 1st LR with unflushed * buffers */
+ atomic64_t l_tail_lsn ____cacheline_aligned_in_smp;
/* The following field are used for debugging; need to hold icloglock */
#ifdef DEBUG
@@ -564,6 +564,31 @@ int xlog_write(struct log *log, struct xfs_log_vec *log_vector,
xlog_in_core_t **commit_iclog, uint flags);
/*
+ * When we crack an atomic LSN, we sample it first so that the value will not
+ * change while we are cracking it into the component values. This means we
+ * will always get consistent component values to work from. This should always
+ * be used to smaple and crack LSNs taht are stored and updated in atomic
+ * variables.
+ */
+static inline void
+xlog_crack_atomic_lsn(atomic64_t *lsn, uint *cycle, uint *block)
+{
+ xfs_lsn_t val = atomic64_read(lsn);
+
+ *cycle = CYCLE_LSN(val);
+ *block = BLOCK_LSN(val);
+}
+
+/*
+ * Calculate and assign a value to an atomic LSN variable from component pieces.
+ */
+static inline void
+xlog_assign_atomic_lsn(atomic64_t *lsn, uint cycle, uint block)
+{
+ atomic64_set(lsn, xlog_assign_lsn(cycle, block));
+}
+
+/*
* When we crack the grrant head, we sample it first so that the value will not
* change while we are cracking it into the component values. This means we
* will always get consistent component values to work from.
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 7e5b8ab..147fb34 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -925,7 +925,7 @@ xlog_find_tail(
log->l_curr_cycle = be32_to_cpu(rhead->h_cycle);
if (found == 2)
log->l_curr_cycle++;
- log->l_tail_lsn = be64_to_cpu(rhead->h_tail_lsn);
+ atomic64_set(&log->l_tail_lsn, be64_to_cpu(rhead->h_tail_lsn));
atomic64_set(&log->l_last_sync_lsn, be64_to_cpu(rhead->h_lsn));
xlog_assign_grant_head(&log->l_grant_reserve_head, log->l_curr_cycle,
BBTOB(log->l_curr_block));
@@ -960,7 +960,7 @@ xlog_find_tail(
}
after_umount_blk = (i + hblks + (int)
BTOBB(be32_to_cpu(rhead->h_len))) % log->l_logBBsize;
- tail_lsn = log->l_tail_lsn;
+ tail_lsn = atomic64_read(&log->l_tail_lsn);
if (*head_blk == after_umount_blk &&
be32_to_cpu(rhead->h_num_logops) == 1) {
umount_data_blk = (i + hblks) % log->l_logBBsize;
@@ -975,12 +975,10 @@ xlog_find_tail(
* log records will point recovery to after the
* current unmount record.
*/
- log->l_tail_lsn =
- xlog_assign_lsn(log->l_curr_cycle,
- after_umount_blk);
- atomic64_set(&log->l_last_sync_lsn,
- xlog_assign_lsn(log->l_curr_cycle,
- after_umount_blk));
+ xlog_assign_atomic_lsn(&log->l_tail_lsn,
+ log->l_curr_cycle, after_umount_blk);
+ xlog_assign_atomic_lsn(&log->l_last_sync_lsn,
+ log->l_curr_cycle, after_umount_blk);
*tail_blk = after_umount_blk;
/*
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 09/12] xfs: convert log grant heads to atomic variables
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (7 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 08/12] xfs: convert l_tail_lsn " Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-13 4:44 ` [PATCH 10/12] xfs: introduce new locks for the log grant ticket wait queues Dave Chinner
` (2 subsequent siblings)
11 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
Convert the log grant heads to atomic64_t types in preparation for
converting the accounting algorithms to atomic operations. his patch
just converts the variables; the algorithmic changes are in a
separate patch for clarity.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_log.c | 8 ++++----
fs/xfs/xfs_log_priv.h | 12 ++++++------
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 57bd143..cddfc6b 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -47,7 +47,7 @@ STATIC xlog_t * xlog_alloc_log(xfs_mount_t *mp,
xfs_buftarg_t *log_target,
xfs_daddr_t blk_offset,
int num_bblks);
-STATIC int xlog_space_left(struct log *log, int64_t *head);
+STATIC int xlog_space_left(struct log *log, atomic64_t *head);
STATIC int xlog_sync(xlog_t *log, xlog_in_core_t *iclog);
STATIC void xlog_dealloc_log(xlog_t *log);
@@ -100,7 +100,7 @@ STATIC int xlog_iclogs_empty(xlog_t *log);
static void
xlog_grant_sub_space(
struct log *log,
- int64_t *head,
+ atomic64_t *head,
int bytes)
{
int cycle, space;
@@ -119,7 +119,7 @@ xlog_grant_sub_space(
static void
xlog_grant_add_space(
struct log *log,
- int64_t *head,
+ atomic64_t *head,
int bytes)
{
int tmp;
@@ -816,7 +816,7 @@ xlog_assign_tail_lsn(
STATIC int
xlog_space_left(
struct log *log,
- int64_t *head)
+ atomic64_t *head)
{
int free_bytes;
int tail_bytes;
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 100a410..24f510f 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -515,8 +515,8 @@ typedef struct log {
spinlock_t l_grant_lock ____cacheline_aligned_in_smp;
struct list_head l_reserveq;
struct list_head l_writeq;
- int64_t l_grant_reserve_head;
- int64_t l_grant_write_head;
+ atomic64_t l_grant_reserve_head;
+ atomic64_t l_grant_write_head;
/*
* l_last_sync_lsn and l_tail_lsn are atomics so they can be set and
@@ -594,18 +594,18 @@ xlog_assign_atomic_lsn(atomic64_t *lsn, uint cycle, uint block)
* will always get consistent component values to work from.
*/
static inline void
-xlog_crack_grant_head(int64_t *head, int *cycle, int *space)
+xlog_crack_grant_head(atomic64_t *head, int *cycle, int *space)
{
- int64_t val = *head;
+ int64_t val = atomic64_read(head);
*cycle = val >> 32;
*space = val & 0xffffffff;
}
static inline void
-xlog_assign_grant_head(int64_t *head, int cycle, int space)
+xlog_assign_grant_head(atomic64_t *head, int cycle, int space)
{
- *head = ((int64_t)cycle << 32) | space;
+ atomic64_set(head, ((int64_t)cycle << 32) | space);
}
/*
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 10/12] xfs: introduce new locks for the log grant ticket wait queues
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (8 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 09/12] xfs: convert log grant heads to atomic variables Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:49 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 11/12] xfs: convert grant head manipulations to lockless algorithm Dave Chinner
2010-12-13 4:44 ` [PATCH 12/12] xfs: kill useless spinlock_destroy macro Dave Chinner
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
The log grant ticket wait queues are currently protected by the log
grant lock. However, the queues are functionally independent from
each other, and operations on them only require serialisation
against other queue operations now that all of the other log
variables they use are atomic values.
Hence, we can make them independent of the grant lock by introducing
new locks just to protect the lists operations. because the lists
are independent, we can use a lock per list and ensure that reserve
and write head queuing do not contend.
To ensure forced shutdowns work correctly in conjunction with the
new fast paths, ensure that we check whether the log has been shut
down in the grant functions once we hold the relevant spin locks but
before we go to sleep. This is needed to co-ordinate correctly with
the wakeups that are issued on the ticket queues so we don't leave
any processes sleeping on the queues during a shutdown.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/linux-2.6/xfs_trace.h | 2 +
fs/xfs/xfs_log.c | 139 +++++++++++++++++++++++++-----------------
fs/xfs/xfs_log_priv.h | 16 ++++-
3 files changed, 97 insertions(+), 60 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index 2f62d0f..7fae8d5 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -837,6 +837,7 @@ DEFINE_LOGGRANT_EVENT(xfs_log_grant_sleep1);
DEFINE_LOGGRANT_EVENT(xfs_log_grant_wake1);
DEFINE_LOGGRANT_EVENT(xfs_log_grant_sleep2);
DEFINE_LOGGRANT_EVENT(xfs_log_grant_wake2);
+DEFINE_LOGGRANT_EVENT(xfs_log_grant_wake_up);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_enter);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_exit);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_error);
@@ -844,6 +845,7 @@ DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_sleep1);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_wake1);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_sleep2);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_wake2);
+DEFINE_LOGGRANT_EVENT(xfs_log_regrant_write_wake_up);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_reserve_enter);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_reserve_exit);
DEFINE_LOGGRANT_EVENT(xfs_log_regrant_reserve_sub);
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index cddfc6b..de3ad18 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -682,12 +682,12 @@ xfs_log_move_tail(xfs_mount_t *mp,
if (tail_lsn != 1)
atomic64_set(&log->l_tail_lsn, tail_lsn);
- spin_lock(&log->l_grant_lock);
- if (!list_empty(&log->l_writeq)) {
+ if (!list_empty_careful(&log->l_writeq)) {
#ifdef DEBUG
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
panic("Recovery problem");
#endif
+ spin_lock(&log->l_grant_write_lock);
free_bytes = xlog_space_left(log, &log->l_grant_write_head);
list_for_each_entry(tic, &log->l_writeq, t_queue) {
ASSERT(tic->t_flags & XLOG_TIC_PERM_RESERV);
@@ -696,15 +696,18 @@ xfs_log_move_tail(xfs_mount_t *mp,
break;
tail_lsn = 0;
free_bytes -= tic->t_unit_res;
+ trace_xfs_log_regrant_write_wake_up(log, tic);
wake_up(&tic->t_wait);
}
+ spin_unlock(&log->l_grant_write_lock);
}
- if (!list_empty(&log->l_reserveq)) {
+ if (!list_empty_careful(&log->l_reserveq)) {
#ifdef DEBUG
if (log->l_flags & XLOG_ACTIVE_RECOVERY)
panic("Recovery problem");
#endif
+ spin_lock(&log->l_grant_reserve_lock);
free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
list_for_each_entry(tic, &log->l_reserveq, t_queue) {
if (tic->t_flags & XLOG_TIC_PERM_RESERV)
@@ -715,11 +718,12 @@ xfs_log_move_tail(xfs_mount_t *mp,
break;
tail_lsn = 0;
free_bytes -= need_bytes;
+ trace_xfs_log_grant_wake_up(log, tic);
wake_up(&tic->t_wait);
}
+ spin_unlock(&log->l_grant_reserve_lock);
}
- spin_unlock(&log->l_grant_lock);
-} /* xfs_log_move_tail */
+}
/*
* Determine if we have a transaction that has gone to disk
@@ -1010,6 +1014,8 @@ xlog_alloc_log(xfs_mount_t *mp,
xlog_assign_grant_head(&log->l_grant_write_head, 1, 0);
INIT_LIST_HEAD(&log->l_reserveq);
INIT_LIST_HEAD(&log->l_writeq);
+ spin_lock_init(&log->l_grant_reserve_lock);
+ spin_lock_init(&log->l_grant_write_lock);
error = EFSCORRUPTED;
if (xfs_sb_version_hassector(&mp->m_sb)) {
@@ -2477,6 +2483,18 @@ restart:
*
* Once a ticket gets put onto the reserveq, it will only return after
* the needed reservation is satisfied.
+ *
+ * This function is structured so that it has a lock free fast path. This is
+ * necessary because every new transaction reservation will come through this
+ * path. Hence any lock will be globally hot if we take it unconditionally on
+ * every pass.
+ *
+ * As tickets are only ever moved on and off the reserveq under the
+ * l_grant_reserve_lock, we only need to take that lock if we are going
+ * to add the ticket to the queue and sleep. We can avoid taking the lock if the
+ * ticket was never added to the reserveq because the t_queue list head will be
+ * empty and we hold the only reference to it so it can safely be checked
+ * unlocked.
*/
STATIC int
xlog_grant_log_space(xlog_t *log,
@@ -2490,13 +2508,20 @@ xlog_grant_log_space(xlog_t *log,
panic("grant Recovery problem");
#endif
- /* Is there space or do we need to sleep? */
- spin_lock(&log->l_grant_lock);
-
trace_xfs_log_grant_enter(log, tic);
+ need_bytes = tic->t_unit_res;
+ if (tic->t_flags & XFS_LOG_PERM_RESERV)
+ need_bytes *= tic->t_ocnt;
+
/* something is already sleeping; insert new transaction at end */
- if (!list_empty(&log->l_reserveq)) {
+ if (!list_empty_careful(&log->l_reserveq)) {
+ spin_lock(&log->l_grant_reserve_lock);
+ /* recheck the queue now we are locked */
+ if (list_empty(&log->l_reserveq)) {
+ spin_unlock(&log->l_grant_reserve_lock);
+ goto redo;
+ }
list_add_tail(&tic->t_queue, &log->l_reserveq);
trace_xfs_log_grant_sleep1(log, tic);
@@ -2509,48 +2534,47 @@ xlog_grant_log_space(xlog_t *log,
goto error_return;
XFS_STATS_INC(xs_sleep_logspace);
- xlog_wait(&tic->t_wait, &log->l_grant_lock);
+ xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock);
/*
* If we got an error, and the filesystem is shutting down,
* we'll catch it down below. So just continue...
*/
trace_xfs_log_grant_wake1(log, tic);
- spin_lock(&log->l_grant_lock);
}
- if (tic->t_flags & XFS_LOG_PERM_RESERV)
- need_bytes = tic->t_unit_res*tic->t_ocnt;
- else
- need_bytes = tic->t_unit_res;
redo:
if (XLOG_FORCED_SHUTDOWN(log))
- goto error_return;
+ goto error_return_unlocked;
free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
if (free_bytes < need_bytes) {
+ spin_lock(&log->l_grant_reserve_lock);
if (list_empty(&tic->t_queue))
list_add_tail(&tic->t_queue, &log->l_reserveq);
trace_xfs_log_grant_sleep2(log, tic);
+ if (XLOG_FORCED_SHUTDOWN(log))
+ goto error_return;
+
xlog_grant_push_ail(log, need_bytes);
XFS_STATS_INC(xs_sleep_logspace);
- xlog_wait(&tic->t_wait, &log->l_grant_lock);
-
- spin_lock(&log->l_grant_lock);
- if (XLOG_FORCED_SHUTDOWN(log))
- goto error_return;
+ xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock);
trace_xfs_log_grant_wake2(log, tic);
-
goto redo;
}
- list_del_init(&tic->t_queue);
+ if (!list_empty(&tic->t_queue)) {
+ spin_lock(&log->l_grant_reserve_lock);
+ list_del_init(&tic->t_queue);
+ spin_unlock(&log->l_grant_reserve_lock);
+ }
/* we've got enough space */
+ spin_lock(&log->l_grant_lock);
xlog_grant_add_space(log, &log->l_grant_reserve_head, need_bytes);
xlog_grant_add_space(log, &log->l_grant_write_head, need_bytes);
trace_xfs_log_grant_exit(log, tic);
@@ -2559,8 +2583,11 @@ redo:
spin_unlock(&log->l_grant_lock);
return 0;
- error_return:
+error_return_unlocked:
+ spin_lock(&log->l_grant_reserve_lock);
+error_return:
list_del_init(&tic->t_queue);
+ spin_unlock(&log->l_grant_reserve_lock);
trace_xfs_log_grant_error(log, tic);
/*
@@ -2570,7 +2597,6 @@ redo:
*/
tic->t_curr_res = 0;
tic->t_cnt = 0; /* ungrant will give back unit_res * t_cnt. */
- spin_unlock(&log->l_grant_lock);
return XFS_ERROR(EIO);
} /* xlog_grant_log_space */
@@ -2578,7 +2604,8 @@ redo:
/*
* Replenish the byte reservation required by moving the grant write head.
*
- *
+ * Similar to xlog_grant_log_space, the function is structured to have a lock
+ * free fast path.
*/
STATIC int
xlog_regrant_write_log_space(xlog_t *log,
@@ -2597,12 +2624,9 @@ xlog_regrant_write_log_space(xlog_t *log,
panic("regrant Recovery problem");
#endif
- spin_lock(&log->l_grant_lock);
-
trace_xfs_log_regrant_write_enter(log, tic);
-
if (XLOG_FORCED_SHUTDOWN(log))
- goto error_return;
+ goto error_return_unlocked;
/* If there are other waiters on the queue then give them a
* chance at logspace before us. Wake up the first waiters,
@@ -2611,8 +2635,10 @@ xlog_regrant_write_log_space(xlog_t *log,
* this transaction.
*/
need_bytes = tic->t_unit_res;
- if (!list_empty(&log->l_writeq)) {
+ if (!list_empty_careful(&log->l_writeq)) {
struct xlog_ticket *ntic;
+
+ spin_lock(&log->l_grant_write_lock);
free_bytes = xlog_space_left(log, &log->l_grant_write_head);
list_for_each_entry(ntic, &log->l_writeq, t_queue) {
ASSERT(ntic->t_flags & XLOG_TIC_PERM_RESERV);
@@ -2627,50 +2653,48 @@ xlog_regrant_write_log_space(xlog_t *log,
struct xlog_ticket, t_queue)) {
if (list_empty(&tic->t_queue))
list_add_tail(&tic->t_queue, &log->l_writeq);
-
trace_xfs_log_regrant_write_sleep1(log, tic);
xlog_grant_push_ail(log, need_bytes);
XFS_STATS_INC(xs_sleep_logspace);
- xlog_wait(&tic->t_wait, &log->l_grant_lock);
-
- /* If we're shutting down, this tic is already
- * off the queue */
- spin_lock(&log->l_grant_lock);
- if (XLOG_FORCED_SHUTDOWN(log))
- goto error_return;
-
+ xlog_wait(&tic->t_wait, &log->l_grant_write_lock);
trace_xfs_log_regrant_write_wake1(log, tic);
- }
+ } else
+ spin_unlock(&log->l_grant_write_lock);
}
redo:
if (XLOG_FORCED_SHUTDOWN(log))
- goto error_return;
+ goto error_return_unlocked;
free_bytes = xlog_space_left(log, &log->l_grant_write_head);
if (free_bytes < need_bytes) {
+ spin_lock(&log->l_grant_write_lock);
if (list_empty(&tic->t_queue))
list_add_tail(&tic->t_queue, &log->l_writeq);
+
+ if (XLOG_FORCED_SHUTDOWN(log))
+ goto error_return;
+
xlog_grant_push_ail(log, need_bytes);
XFS_STATS_INC(xs_sleep_logspace);
trace_xfs_log_regrant_write_sleep2(log, tic);
- xlog_wait(&tic->t_wait, &log->l_grant_lock);
-
- /* If we're shutting down, this tic is already off the queue */
- spin_lock(&log->l_grant_lock);
- if (XLOG_FORCED_SHUTDOWN(log))
- goto error_return;
+ xlog_wait(&tic->t_wait, &log->l_grant_write_lock);
trace_xfs_log_regrant_write_wake2(log, tic);
goto redo;
}
- list_del_init(&tic->t_queue);
+ if (!list_empty(&tic->t_queue)) {
+ spin_lock(&log->l_grant_write_lock);
+ list_del_init(&tic->t_queue);
+ spin_unlock(&log->l_grant_write_lock);
+ }
/* we've got enough space */
+ spin_lock(&log->l_grant_lock);
xlog_grant_add_space(log, &log->l_grant_write_head, need_bytes);
trace_xfs_log_regrant_write_exit(log, tic);
xlog_verify_grant_head(log, 1);
@@ -2679,8 +2703,11 @@ redo:
return 0;
+ error_return_unlocked:
+ spin_lock(&log->l_grant_write_lock);
error_return:
list_del_init(&tic->t_queue);
+ spin_unlock(&log->l_grant_write_lock);
trace_xfs_log_regrant_write_error(log, tic);
/*
@@ -2690,7 +2717,6 @@ redo:
*/
tic->t_curr_res = 0;
tic->t_cnt = 0; /* ungrant will give back unit_res * t_cnt. */
- spin_unlock(&log->l_grant_lock);
return XFS_ERROR(EIO);
} /* xlog_regrant_write_log_space */
@@ -3664,12 +3690,10 @@ xfs_log_force_umount(
xlog_cil_force(log);
/*
- * We must hold both the GRANT lock and the LOG lock,
- * before we mark the filesystem SHUTDOWN and wake
- * everybody up to tell the bad news.
+ * mark the filesystem and the as in a shutdown state and wake
+ * everybody up to tell them the bad news.
*/
spin_lock(&log->l_icloglock);
- spin_lock(&log->l_grant_lock);
mp->m_flags |= XFS_MOUNT_FS_SHUTDOWN;
if (mp->m_sb_bp)
XFS_BUF_DONE(mp->m_sb_bp);
@@ -3694,14 +3718,17 @@ xfs_log_force_umount(
* means we have to wake up everybody queued up on reserveq as well as
* writeq. In addition, we make sure in xlog_{re}grant_log_space that
* we don't enqueue anything once the SHUTDOWN flag is set, and this
- * action is protected by the GRANTLOCK.
+ * action is protected by the grant locks.
*/
+ spin_lock(&log->l_grant_reserve_lock);
list_for_each_entry(tic, &log->l_reserveq, t_queue)
wake_up(&tic->t_wait);
+ spin_unlock(&log->l_grant_reserve_lock);
+ spin_lock(&log->l_grant_write_lock);
list_for_each_entry(tic, &log->l_writeq, t_queue)
wake_up(&tic->t_wait);
- spin_unlock(&log->l_grant_lock);
+ spin_unlock(&log->l_grant_write_lock);
if (!(log->l_iclog->ic_state & XLOG_STATE_IOERROR)) {
ASSERT(!logerror);
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 24f510f..c8e23ff 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -513,10 +513,6 @@ typedef struct log {
/* The following block of fields are changed while holding grant_lock */
spinlock_t l_grant_lock ____cacheline_aligned_in_smp;
- struct list_head l_reserveq;
- struct list_head l_writeq;
- atomic64_t l_grant_reserve_head;
- atomic64_t l_grant_write_head;
/*
* l_last_sync_lsn and l_tail_lsn are atomics so they can be set and
@@ -529,6 +525,18 @@ typedef struct log {
/* lsn of 1st LR with unflushed * buffers */
atomic64_t l_tail_lsn ____cacheline_aligned_in_smp;
+ /*
+ * ticket grant locks, queues and accounting have their own cachlines
+ * as these are quite hot and can be operated on concurrently.
+ */
+ spinlock_t l_grant_reserve_lock ____cacheline_aligned_in_smp;
+ struct list_head l_reserveq;
+ atomic64_t l_grant_reserve_head;
+
+ spinlock_t l_grant_write_lock ____cacheline_aligned_in_smp;
+ struct list_head l_writeq;
+ atomic64_t l_grant_write_head;
+
/* The following field are used for debugging; need to hold icloglock */
#ifdef DEBUG
char *l_iclog_bak[XLOG_MAX_ICLOGS];
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 11/12] xfs: convert grant head manipulations to lockless algorithm
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (9 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 10/12] xfs: introduce new locks for the log grant ticket wait queues Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
2010-12-20 11:50 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 12/12] xfs: kill useless spinlock_destroy macro Dave Chinner
11 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
The only thing that the grant lock remains to protect is the grant head
manipulations when adding or removing space from the log. These calculations
are already based on atomic variables, so we can already update them safely
without locks. However, the grant head manpulations require atomic multi-step
calculations to be executed, which the algorithms currently don't allow.
To make these multi-step calculations atomic, convert the algorithms to
compare-and-exchange loops on the atomic variables. That is, we sample the old
value, perform the calculation and use atomic64_cmpxchg() to attempt to update
the head with the new value. If the head has not changed since we sampled it,
it will succeed and we are done. Otherwise, we rerun the calculation again from
a new sample of the head.
This allows us to remove the grant lock from around all the grant head space
manipulations, and that effectively removes the grant lock from the log
completely. Hence we can remove the grant lock completely from the log at this
point.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log.c | 103 ++++++++++++++++---------------------------------
fs/xfs/xfs_log_priv.h | 23 +++++++----
2 files changed, 49 insertions(+), 77 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index de3ad18..58326ba 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -81,7 +81,6 @@ STATIC void xlog_ungrant_log_space(xlog_t *log,
#if defined(DEBUG)
STATIC void xlog_verify_dest_ptr(xlog_t *log, char *ptr);
-STATIC void xlog_verify_grant_head(xlog_t *log, int equals);
STATIC void xlog_verify_grant_tail(struct log *log);
STATIC void xlog_verify_iclog(xlog_t *log, xlog_in_core_t *iclog,
int count, boolean_t syncing);
@@ -89,7 +88,6 @@ STATIC void xlog_verify_tail_lsn(xlog_t *log, xlog_in_core_t *iclog,
xfs_lsn_t tail_lsn);
#else
#define xlog_verify_dest_ptr(a,b)
-#define xlog_verify_grant_head(a,b)
#define xlog_verify_grant_tail(a)
#define xlog_verify_iclog(a,b,c,d)
#define xlog_verify_tail_lsn(a,b,c)
@@ -103,17 +101,24 @@ xlog_grant_sub_space(
atomic64_t *head,
int bytes)
{
- int cycle, space;
+ int64_t head_val = atomic64_read(head);
+ int64_t new, old;
- xlog_crack_grant_head(head, &cycle, &space);
+ do {
+ int cycle, space;
- space -= bytes;
- if (space < 0) {
- space += log->l_logsize;
- cycle--;
- }
+ xlog_crack_grant_head_val(head_val, &cycle, &space);
- xlog_assign_grant_head(head, cycle, space);
+ space -= bytes;
+ if (space < 0) {
+ space += log->l_logsize;
+ cycle--;
+ }
+
+ old = head_val;
+ new = xlog_assign_grant_head_val(cycle, space);
+ head_val = atomic64_cmpxchg(head, old, new);
+ } while (head_val != old);
}
static void
@@ -122,20 +127,27 @@ xlog_grant_add_space(
atomic64_t *head,
int bytes)
{
- int tmp;
- int cycle, space;
+ int64_t head_val = atomic64_read(head);
+ int64_t new, old;
- xlog_crack_grant_head(head, &cycle, &space);
+ do {
+ int tmp;
+ int cycle, space;
- tmp = log->l_logsize - space;
- if (tmp > bytes)
- space += bytes;
- else {
- space = bytes - tmp;
- cycle++;
- }
+ xlog_crack_grant_head_val(head_val, &cycle, &space);
- xlog_assign_grant_head(head, cycle, space);
+ tmp = log->l_logsize - space;
+ if (tmp > bytes)
+ space += bytes;
+ else {
+ space = bytes - tmp;
+ cycle++;
+ }
+
+ old = head_val;
+ new = xlog_assign_grant_head_val(cycle, space);
+ head_val = atomic64_cmpxchg(head, old, new);
+ } while (head_val != old);
}
static void
@@ -318,9 +330,7 @@ xfs_log_reserve(
trace_xfs_log_reserve(log, internal_ticket);
- spin_lock(&log->l_grant_lock);
xlog_grant_push_ail(log, internal_ticket->t_unit_res);
- spin_unlock(&log->l_grant_lock);
retval = xlog_regrant_write_log_space(log, internal_ticket);
} else {
/* may sleep if need to allocate more tickets */
@@ -334,11 +344,9 @@ xfs_log_reserve(
trace_xfs_log_reserve(log, internal_ticket);
- spin_lock(&log->l_grant_lock);
xlog_grant_push_ail(log,
(internal_ticket->t_unit_res *
internal_ticket->t_cnt));
- spin_unlock(&log->l_grant_lock);
retval = xlog_grant_log_space(log, internal_ticket);
}
@@ -1057,7 +1065,6 @@ xlog_alloc_log(xfs_mount_t *mp,
log->l_xbuf = bp;
spin_lock_init(&log->l_icloglock);
- spin_lock_init(&log->l_grant_lock);
init_waitqueue_head(&log->l_flush_wait);
/* log record size must be multiple of BBSIZE; see xlog_rec_header_t */
@@ -1135,7 +1142,6 @@ out_free_iclog:
kmem_free(iclog);
}
spinlock_destroy(&log->l_icloglock);
- spinlock_destroy(&log->l_grant_lock);
xfs_buf_free(log->l_xbuf);
out_free_log:
kmem_free(log);
@@ -1331,10 +1337,8 @@ xlog_sync(xlog_t *log,
roundoff < BBTOB(1)));
/* move grant heads by roundoff in sync */
- spin_lock(&log->l_grant_lock);
xlog_grant_add_space(log, &log->l_grant_reserve_head, roundoff);
xlog_grant_add_space(log, &log->l_grant_write_head, roundoff);
- spin_unlock(&log->l_grant_lock);
/* put cycle number in every block */
xlog_pack_data(log, iclog, roundoff);
@@ -1455,7 +1459,6 @@ xlog_dealloc_log(xlog_t *log)
iclog = next_iclog;
}
spinlock_destroy(&log->l_icloglock);
- spinlock_destroy(&log->l_grant_lock);
xfs_buf_free(log->l_xbuf);
log->l_mp->m_log = NULL;
@@ -2574,13 +2577,10 @@ redo:
}
/* we've got enough space */
- spin_lock(&log->l_grant_lock);
xlog_grant_add_space(log, &log->l_grant_reserve_head, need_bytes);
xlog_grant_add_space(log, &log->l_grant_write_head, need_bytes);
trace_xfs_log_grant_exit(log, tic);
- xlog_verify_grant_head(log, 1);
xlog_verify_grant_tail(log);
- spin_unlock(&log->l_grant_lock);
return 0;
error_return_unlocked:
@@ -2694,12 +2694,9 @@ redo:
}
/* we've got enough space */
- spin_lock(&log->l_grant_lock);
xlog_grant_add_space(log, &log->l_grant_write_head, need_bytes);
trace_xfs_log_regrant_write_exit(log, tic);
- xlog_verify_grant_head(log, 1);
xlog_verify_grant_tail(log);
- spin_unlock(&log->l_grant_lock);
return 0;
@@ -2737,7 +2734,6 @@ xlog_regrant_reserve_log_space(xlog_t *log,
if (ticket->t_cnt > 0)
ticket->t_cnt--;
- spin_lock(&log->l_grant_lock);
xlog_grant_sub_space(log, &log->l_grant_reserve_head,
ticket->t_curr_res);
xlog_grant_sub_space(log, &log->l_grant_write_head,
@@ -2747,21 +2743,15 @@ xlog_regrant_reserve_log_space(xlog_t *log,
trace_xfs_log_regrant_reserve_sub(log, ticket);
- xlog_verify_grant_head(log, 1);
-
/* just return if we still have some of the pre-reserved space */
- if (ticket->t_cnt > 0) {
- spin_unlock(&log->l_grant_lock);
+ if (ticket->t_cnt > 0)
return;
- }
xlog_grant_add_space(log, &log->l_grant_reserve_head,
ticket->t_unit_res);
trace_xfs_log_regrant_reserve_exit(log, ticket);
- xlog_verify_grant_head(log, 0);
- spin_unlock(&log->l_grant_lock);
ticket->t_curr_res = ticket->t_unit_res;
xlog_tic_reset_res(ticket);
} /* xlog_regrant_reserve_log_space */
@@ -2790,7 +2780,6 @@ xlog_ungrant_log_space(xlog_t *log,
if (ticket->t_cnt > 0)
ticket->t_cnt--;
- spin_lock(&log->l_grant_lock);
trace_xfs_log_ungrant_enter(log, ticket);
trace_xfs_log_ungrant_sub(log, ticket);
@@ -2809,8 +2798,6 @@ xlog_ungrant_log_space(xlog_t *log,
trace_xfs_log_ungrant_exit(log, ticket);
- xlog_verify_grant_head(log, 1);
- spin_unlock(&log->l_grant_lock);
xfs_log_move_tail(log->l_mp, 1);
} /* xlog_ungrant_log_space */
@@ -3429,28 +3416,6 @@ xlog_verify_dest_ptr(
}
STATIC void
-xlog_verify_grant_head(xlog_t *log, int equals)
-{
- int reserve_cycle, reserve_space;
- int write_cycle, write_space;
-
- xlog_crack_grant_head(&log->l_grant_reserve_head,
- &reserve_cycle, &reserve_space);
- xlog_crack_grant_head(&log->l_grant_write_head,
- &write_cycle, &write_space);
-
- if (reserve_cycle == write_cycle) {
- if (equals)
- ASSERT(reserve_space >= write_space);
- else
- ASSERT(reserve_space > write_space);
- } else {
- ASSERT(reserve_cycle - 1 == write_cycle);
- ASSERT(write_space >= reserve_space);
- }
-}
-
-STATIC void
xlog_verify_grant_tail(
struct log *log)
{
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index c8e23ff..3b59d61 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -511,9 +511,6 @@ typedef struct log {
int l_curr_block; /* current logical log block */
int l_prev_block; /* previous logical log block */
- /* The following block of fields are changed while holding grant_lock */
- spinlock_t l_grant_lock ____cacheline_aligned_in_smp;
-
/*
* l_last_sync_lsn and l_tail_lsn are atomics so they can be set and
* read without needing to hold specific locks. To avoid operations
@@ -597,23 +594,33 @@ xlog_assign_atomic_lsn(atomic64_t *lsn, uint cycle, uint block)
}
/*
- * When we crack the grrant head, we sample it first so that the value will not
+ * When we crack the grant head, we sample it first so that the value will not
* change while we are cracking it into the component values. This means we
* will always get consistent component values to work from.
*/
static inline void
-xlog_crack_grant_head(atomic64_t *head, int *cycle, int *space)
+xlog_crack_grant_head_val(int64_t val, int *cycle, int *space)
{
- int64_t val = atomic64_read(head);
-
*cycle = val >> 32;
*space = val & 0xffffffff;
}
static inline void
+xlog_crack_grant_head(atomic64_t *head, int *cycle, int *space)
+{
+ xlog_crack_grant_head_val(atomic64_read(head), cycle, space);
+}
+
+static inline int64_t
+xlog_assign_grant_head_val(int cycle, int space)
+{
+ return ((int64_t)cycle << 32) | space;
+}
+
+static inline void
xlog_assign_grant_head(atomic64_t *head, int cycle, int space)
{
- atomic64_set(head, ((int64_t)cycle << 32) | space);
+ atomic64_set(head, xlog_assign_grant_head_val(cycle, space));
}
/*
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 12/12] xfs: kill useless spinlock_destroy macro
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
` (10 preceding siblings ...)
2010-12-13 4:44 ` [PATCH 11/12] xfs: convert grant head manipulations to lockless algorithm Dave Chinner
@ 2010-12-13 4:44 ` Dave Chinner
11 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2010-12-13 4:44 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
It is only used in 2 places in the log code, and is an empty macro,
so get rid of it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/linux-2.6/xfs_linux.h | 2 --
fs/xfs/xfs_log.c | 2 --
2 files changed, 0 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_linux.h b/fs/xfs/linux-2.6/xfs_linux.h
index ccebd86..e7cfa27 100644
--- a/fs/xfs/linux-2.6/xfs_linux.h
+++ b/fs/xfs/linux-2.6/xfs_linux.h
@@ -113,8 +113,6 @@
#define current_restore_flags_nested(sp, f) \
(current->flags = ((current->flags & ~(f)) | (*(sp) & (f))))
-#define spinlock_destroy(lock)
-
#define NBBY 8 /* number of bits per byte */
/*
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 58326ba..7fadc36 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1141,7 +1141,6 @@ out_free_iclog:
xfs_buf_free(iclog->ic_bp);
kmem_free(iclog);
}
- spinlock_destroy(&log->l_icloglock);
xfs_buf_free(log->l_xbuf);
out_free_log:
kmem_free(log);
@@ -1458,7 +1457,6 @@ xlog_dealloc_log(xlog_t *log)
kmem_free(iclog);
iclog = next_iclog;
}
- spinlock_destroy(&log->l_icloglock);
xfs_buf_free(log->l_xbuf);
log->l_mp->m_log = NULL;
--
1.7.2.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 01/12] xfs: convert log grant ticket queues to list heads
2010-12-13 4:44 ` [PATCH 01/12] xfs: convert log grant ticket queues to list heads Dave Chinner
@ 2010-12-20 11:34 ` Christoph Hellwig
2010-12-21 0:55 ` Dave Chinner
0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:34 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Mon, Dec 13, 2010 at 03:44:32PM +1100, Dave Chinner wrote:
> - __field(void *, reserve_headq)
> - __field(void *, write_headq)
> + __field(void *, reserveq)
> + __field(void *, writeq)
To repeat my question from the last review: what's the point in logging
this at all? There's not much we can do with it from trace-cmd / perf
output. What might be more useful is a list_empty() boolean flag.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 03/12] xfs: rework log grant space calculations
2010-12-13 4:44 ` [PATCH 03/12] xfs: rework log grant space calculations Dave Chinner
@ 2010-12-20 11:37 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:37 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 04/12] xfs: combine grant heads into a single 64 bit integer
2010-12-13 4:44 ` [PATCH 04/12] xfs: combine grant heads into a single 64 bit integer Dave Chinner
@ 2010-12-20 11:40 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:40 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Mon, Dec 13, 2010 at 03:44:35PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Prepare for switching the grant heads to atomic variables by
> combining the two 32 bit values that make up the grant head into a
> single 64 bit variable. Provide wrapper functions to combine and
> split the grant heads appropriately for calculations and use them as
> necessary.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
Minor nitpick:
> + if ((tail_cycle == head_cycle) && (head_bytes >= tail_bytes)) {
> + free_bytes = log->l_logsize - (head_bytes - tail_bytes);
> + } else if ((tail_cycle + 1) < head_cycle) {
Lots of superflous braces in the if statements.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 05/12] xfs: use wait queues directly for the log wait queues
2010-12-13 4:44 ` [PATCH 05/12] xfs: use wait queues directly for the log wait queues Dave Chinner
@ 2010-12-20 11:40 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:40 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Mon, Dec 13, 2010 at 03:44:36PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> The log grant queues are one of the few places left using sv_t
> constructs for waiting. Given we are touching this code, we should
> convert them to plain wait queues. While there, convert all the
> other sv_t users in the log code as well.
>
> Seeing as this removes the last users of the sv_t type, remove the
> header file defining the wrapper and the fragments that still
> reference it.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 06/12] xfs: make AIL tail pushing independent of the grant lock
2010-12-13 4:44 ` [PATCH 06/12] xfs: make AIL tail pushing independent of the grant lock Dave Chinner
@ 2010-12-20 11:41 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:41 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 10/12] xfs: introduce new locks for the log grant ticket wait queues
2010-12-13 4:44 ` [PATCH 10/12] xfs: introduce new locks for the log grant ticket wait queues Dave Chinner
@ 2010-12-20 11:49 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:49 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Mon, Dec 13, 2010 at 03:44:41PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> The log grant ticket wait queues are currently protected by the log
> grant lock. However, the queues are functionally independent from
> each other, and operations on them only require serialisation
> against other queue operations now that all of the other log
> variables they use are atomic values.
>
> Hence, we can make them independent of the grant lock by introducing
> new locks just to protect the lists operations. because the lists
> are independent, we can use a lock per list and ensure that reserve
> and write head queuing do not contend.
>
> To ensure forced shutdowns work correctly in conjunction with the
> new fast paths, ensure that we check whether the log has been shut
> down in the grant functions once we hold the relevant spin locks but
> before we go to sleep. This is needed to co-ordinate correctly with
> the wakeups that are issued on the ticket queues so we don't leave
> any processes sleeping on the queues during a shutdown.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 11/12] xfs: convert grant head manipulations to lockless algorithm
2010-12-13 4:44 ` [PATCH 11/12] xfs: convert grant head manipulations to lockless algorithm Dave Chinner
@ 2010-12-20 11:50 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:50 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 02/12] xfs: fact out common grant head/log tail verification code
2010-12-13 4:44 ` [PATCH 02/12] xfs: fact out common grant head/log tail verification code Dave Chinner
@ 2010-12-20 11:51 ` Christoph Hellwig
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 11:51 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Mon, Dec 13, 2010 at 03:44:33PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Factor repeated debug code out of grant head manipulation functions into a
> separate function. This removes ifdef DEBUG spagetti from the code and makes
> the code easier to follow.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 08/12] xfs: convert l_tail_lsn to an atomic variable.
2010-12-13 4:44 ` [PATCH 08/12] xfs: convert l_tail_lsn " Dave Chinner
@ 2010-12-20 12:06 ` Christoph Hellwig
2010-12-21 1:27 ` Dave Chinner
0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2010-12-20 12:06 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Mon, Dec 13, 2010 at 03:44:39PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> log->l_tail_lsn is currently protected by the log grant lock. The
> lock is only needed for serialising readers against writers, so we
> don't really need the lock if we make the l_tail_lsn variable an
> atomic. Converting the l_tail_lsn variable to an atomic64_t means we
> can start to peel back the grant lock from various operations.
>
> Also, provide functions to safely crack an atomic LSN variable into
> it's component pieces and to recombined the components into an
> atomic variable. Use them where appropriate.
>
> This also removes the need for explicitly holding a spinlock to read
> the l_tail_lsn on 32 bit platforms.
I know I suggested this, but I think the atomic read of l_tail_lsn
in xlog_space_left might be problemetic for the call from
xlog_grant_push_ail, where we read it twice now. Maybe split
xlog_space_left into a __xlog_space_left that gets the already cracked
values, and xlog_space_left as a wrapper around it?
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 01/12] xfs: convert log grant ticket queues to list heads
2010-12-20 11:34 ` Christoph Hellwig
@ 2010-12-21 0:55 ` Dave Chinner
0 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2010-12-21 0:55 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Mon, Dec 20, 2010 at 06:34:58AM -0500, Christoph Hellwig wrote:
> On Mon, Dec 13, 2010 at 03:44:32PM +1100, Dave Chinner wrote:
> > - __field(void *, reserve_headq)
> > - __field(void *, write_headq)
> > + __field(void *, reserveq)
> > + __field(void *, writeq)
>
> To repeat my question from the last review: what's the point in logging
> this at all? There's not much we can do with it from trace-cmd / perf
> output. What might be more useful is a list_empty() boolean flag.
Oh, I missed that when running through all the comments. Thanks for
pointing it out again - I'll fix it this time.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 08/12] xfs: convert l_tail_lsn to an atomic variable.
2010-12-20 12:06 ` Christoph Hellwig
@ 2010-12-21 1:27 ` Dave Chinner
0 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2010-12-21 1:27 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Mon, Dec 20, 2010 at 07:06:34AM -0500, Christoph Hellwig wrote:
> On Mon, Dec 13, 2010 at 03:44:39PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > log->l_tail_lsn is currently protected by the log grant lock. The
> > lock is only needed for serialising readers against writers, so we
> > don't really need the lock if we make the l_tail_lsn variable an
> > atomic. Converting the l_tail_lsn variable to an atomic64_t means we
> > can start to peel back the grant lock from various operations.
> >
> > Also, provide functions to safely crack an atomic LSN variable into
> > it's component pieces and to recombined the components into an
> > atomic variable. Use them where appropriate.
> >
> > This also removes the need for explicitly holding a spinlock to read
> > the l_tail_lsn on 32 bit platforms.
>
> I know I suggested this, but I think the atomic read of l_tail_lsn
> in xlog_space_left might be problemetic for the call from
> xlog_grant_push_ail, where we read it twice now. Maybe split
> xlog_space_left into a __xlog_space_left that gets the already cracked
> values, and xlog_space_left as a wrapper around it?
I'd convinced myself that it wouldn't be a problem. That is,
once we have a value for the tail_lsn in xlog_grant_push_ail(), the
threshold that we will push to is effectively fixed. The only thing
that will change is the amount of log space currently available,
which can only increase if the tail moves.
Hence we'll either get:
a) not enough log space and have to push, in which case the
value of the tail lsn seen in xlog_space_left() is
irrelevant to the threshold lsn we calculate, or
b) we'll have enough log space and not need to push in which
case we don't need to use the tail_lsn at all because we
don't need to push.
So it seems to me that the double sample of the tail_lsn doesn't
matter at all for this code. Is there a hole in my logic here?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2010-12-21 1:25 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-12-13 4:44 xfs: grant lock scaling and removal V3 Dave Chinner
2010-12-13 4:44 ` [PATCH 01/12] xfs: convert log grant ticket queues to list heads Dave Chinner
2010-12-20 11:34 ` Christoph Hellwig
2010-12-21 0:55 ` Dave Chinner
2010-12-13 4:44 ` [PATCH 02/12] xfs: fact out common grant head/log tail verification code Dave Chinner
2010-12-20 11:51 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 03/12] xfs: rework log grant space calculations Dave Chinner
2010-12-20 11:37 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 04/12] xfs: combine grant heads into a single 64 bit integer Dave Chinner
2010-12-20 11:40 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 05/12] xfs: use wait queues directly for the log wait queues Dave Chinner
2010-12-20 11:40 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 06/12] xfs: make AIL tail pushing independent of the grant lock Dave Chinner
2010-12-20 11:41 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 07/12] xfs: convert l_last_sync_lsn to an atomic variable Dave Chinner
2010-12-13 4:44 ` [PATCH 08/12] xfs: convert l_tail_lsn " Dave Chinner
2010-12-20 12:06 ` Christoph Hellwig
2010-12-21 1:27 ` Dave Chinner
2010-12-13 4:44 ` [PATCH 09/12] xfs: convert log grant heads to atomic variables Dave Chinner
2010-12-13 4:44 ` [PATCH 10/12] xfs: introduce new locks for the log grant ticket wait queues Dave Chinner
2010-12-20 11:49 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 11/12] xfs: convert grant head manipulations to lockless algorithm Dave Chinner
2010-12-20 11:50 ` Christoph Hellwig
2010-12-13 4:44 ` [PATCH 12/12] xfs: kill useless spinlock_destroy macro Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox