From: Timothy Shimmin <tes@sgi.com>
To: David Chinner <dgc@sgi.com>
Cc: xfs-dev <xfs-dev@sgi.com>, xfs-oss <xfs@oss.sgi.com>
Subject: Re: [patch] Use atomics for iclog reference counting
Date: Fri, 15 Feb 2008 14:34:53 +1100 [thread overview]
Message-ID: <47B5085D.30409@sgi.com> (raw)
In-Reply-To: <20080215002429.GT155259@sgi.com>
Dave,
So a bunch of incs/decs/tests converted to the atomic versions.
And the interesting stuff appears to be in xlog_state_release_iclog().
Okay that looks reasonable.
If the decrement of the cnt doesn't go down to zero then we just
return straight away - because we won't be going to sync anything.
And if we do go to zero then we take the lock and continue.
Why do we test for the error/EIO beforehand now too?
Because we don't want to return 0 if we have an error to return?
Seems good.
In the 1st 2 cases of the patch:
> @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp)
>
> spin_lock(&log->l_icloglock);
> iclog = log->l_iclog;
> - iclog->ic_refcnt++;
> + atomic_inc(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
> xlog_state_want_sync(log, iclog);
> (void) xlog_state_release_iclog(log, iclog);
> @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp)
> */
> spin_lock(&log->l_icloglock);
> iclog = log->l_iclog;
> - iclog->ic_refcnt++;
> + atomic_inc(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
Do we still really need to take the lock etc?
--Tim
David Chinner wrote:
> On Fri, Feb 15, 2008 at 10:47:58AM +1100, David Chinner wrote:
>> On Wed, Jan 23, 2008 at 04:13:25PM +1100, Timothy Shimmin wrote:
>>> I'll have a look...
>> Tim, have you had a chance to look at this one yet? I'd like to
>> push this too, but I understand you are kinda busy right now :/
>
> FWIW, you might want to review this version ;)
>
> ----
>
> Now that we update the log tail LSN less frequently on
> transaction completion, we pass the contention straight to
> the global log state lock (l_iclog_lock) during transaction
> completion.
>
> We currently have to take this lock to decrement the iclog
> reference count. there is a reference count on each iclog,
> so we need to take the global lock for all refcount changes.
>
> When large numbers of processes are all doing small trnasctions,
> the iclog reference counts will be quite high, and the state change
> that absolutely requires the l_iclog_lock is the except rather than
> the norm.
>
> Change the reference counting on the iclogs to use atomic_inc/dec
> so that we can use atomic_dec_and_lock during transaction completion
> and avoid the need for grabbing the l_iclog_lock for every reference
> count decrement except the one that matters - the last.
>
> Version 2:
> o remove spurious unlock in shutdown path in xlog_state_release_iclog()
>
> Signed-off-by: Dave Chinner <dgc@sgi.com>
> ---
> fs/xfs/xfs_log.c | 36 ++++++++++++++++++++----------------
> fs/xfs/xfs_log_priv.h | 2 +-
> fs/xfs/xfsidbg.c | 2 +-
> 3 files changed, 22 insertions(+), 18 deletions(-)
>
> Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2008-02-15 11:19:08.076544539 +1100
> +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2008-02-15 11:20:22.558911855 +1100
> @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp)
>
> spin_lock(&log->l_icloglock);
> iclog = log->l_iclog;
> - iclog->ic_refcnt++;
> + atomic_inc(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
> xlog_state_want_sync(log, iclog);
> (void) xlog_state_release_iclog(log, iclog);
> @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp)
> */
> spin_lock(&log->l_icloglock);
> iclog = log->l_iclog;
> - iclog->ic_refcnt++;
> + atomic_inc(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
>
> xlog_state_want_sync(log, iclog);
> @@ -1405,7 +1405,7 @@ xlog_sync(xlog_t *log,
> int v2 = XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb);
>
> XFS_STATS_INC(xs_log_writes);
> - ASSERT(iclog->ic_refcnt == 0);
> + ASSERT(atomic_read(&iclog->ic_refcnt) == 0);
>
> /* Add for LR header */
> count_init = log->l_iclog_hsize + iclog->ic_offset;
> @@ -2312,7 +2312,7 @@ xlog_state_done_syncing(
>
> ASSERT(iclog->ic_state == XLOG_STATE_SYNCING ||
> iclog->ic_state == XLOG_STATE_IOERROR);
> - ASSERT(iclog->ic_refcnt == 0);
> + ASSERT(atomic_read(&iclog->ic_refcnt) == 0);
> ASSERT(iclog->ic_bwritecnt == 1 || iclog->ic_bwritecnt == 2);
>
>
> @@ -2394,7 +2394,7 @@ restart:
> ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE);
> head = &iclog->ic_header;
>
> - iclog->ic_refcnt++; /* prevents sync */
> + atomic_inc(&iclog->ic_refcnt); /* prevents sync */
> log_offset = iclog->ic_offset;
>
> /* On the 1st write to an iclog, figure out lsn. This works
> @@ -2426,12 +2426,12 @@ restart:
> xlog_state_switch_iclogs(log, iclog, iclog->ic_size);
>
> /* If I'm the only one writing to this iclog, sync it to disk */
> - if (iclog->ic_refcnt == 1) {
> + if (atomic_read(&iclog->ic_refcnt) == 1) {
> spin_unlock(&log->l_icloglock);
> if ((error = xlog_state_release_iclog(log, iclog)))
> return error;
> } else {
> - iclog->ic_refcnt--;
> + atomic_dec(&iclog->ic_refcnt);
> spin_unlock(&log->l_icloglock);
> }
> goto restart;
> @@ -2822,18 +2822,21 @@ xlog_state_release_iclog(
> {
> int sync = 0; /* do we sync? */
>
> - spin_lock(&log->l_icloglock);
> + if (iclog->ic_state & XLOG_STATE_IOERROR)
> + return XFS_ERROR(EIO);
> +
> + ASSERT(atomic_read(&iclog->ic_refcnt) > 0);
> + if (!atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock))
> + return 0;
> +
> if (iclog->ic_state & XLOG_STATE_IOERROR) {
> spin_unlock(&log->l_icloglock);
> return XFS_ERROR(EIO);
> }
> -
> - ASSERT(iclog->ic_refcnt > 0);
> ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE ||
> iclog->ic_state == XLOG_STATE_WANT_SYNC);
>
> - if (--iclog->ic_refcnt == 0 &&
> - iclog->ic_state == XLOG_STATE_WANT_SYNC) {
> + if (iclog->ic_state == XLOG_STATE_WANT_SYNC) {
> /* update tail before writing to iclog */
> xlog_assign_tail_lsn(log->l_mp);
> sync++;
> @@ -2953,7 +2956,8 @@ xlog_state_sync_all(xlog_t *log, uint fl
> * previous iclog and go to sleep.
> */
> if (iclog->ic_state == XLOG_STATE_DIRTY ||
> - (iclog->ic_refcnt == 0 && iclog->ic_offset == 0)) {
> + (atomic_read(&iclog->ic_refcnt) == 0
> + && iclog->ic_offset == 0)) {
> iclog = iclog->ic_prev;
> if (iclog->ic_state == XLOG_STATE_ACTIVE ||
> iclog->ic_state == XLOG_STATE_DIRTY)
> @@ -2961,14 +2965,14 @@ xlog_state_sync_all(xlog_t *log, uint fl
> else
> goto maybe_sleep;
> } else {
> - if (iclog->ic_refcnt == 0) {
> + if (atomic_read(&iclog->ic_refcnt) == 0) {
> /* We are the only one with access to this
> * iclog. Flush it out now. There should
> * be a roundoff of zero to show that someone
> * has already taken care of the roundoff from
> * the previous sync.
> */
> - iclog->ic_refcnt++;
> + atomic_inc(&iclog->ic_refcnt);
> lsn = be64_to_cpu(iclog->ic_header.h_lsn);
> xlog_state_switch_iclogs(log, iclog, 0);
> spin_unlock(&log->l_icloglock);
> @@ -3100,7 +3104,7 @@ try_again:
> already_slept = 1;
> goto try_again;
> } else {
> - iclog->ic_refcnt++;
> + atomic_inc(&iclog->ic_refcnt);
> xlog_state_switch_iclogs(log, iclog, 0);
> spin_unlock(&log->l_icloglock);
> if (xlog_state_release_iclog(log, iclog))
> Index: 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log_priv.h 2008-02-15 11:19:08.080544022 +1100
> +++ 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h 2008-02-15 11:19:14.403726218 +1100
> @@ -339,7 +339,7 @@ typedef struct xlog_iclog_fields {
> #endif
> int ic_size;
> int ic_offset;
> - int ic_refcnt;
> + atomic_t ic_refcnt;
> int ic_bwritecnt;
> ushort_t ic_state;
> char *ic_datap; /* pointer to iclog data */
> Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2008-02-15 11:19:08.096541953 +1100
> +++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 2008-02-15 11:19:14.407725701 +1100
> @@ -5633,7 +5633,7 @@ xfsidbg_xiclog(xlog_in_core_t *iclog)
> #else
> NULL,
> #endif
> - iclog->ic_refcnt, iclog->ic_bwritecnt);
> + atomic_read(&iclog->ic_refcnt), iclog->ic_bwritecnt);
> if (iclog->ic_state & XLOG_STATE_ALL)
> printflags(iclog->ic_state, ic_flags, " state:");
> else
>
next prev parent reply other threads:[~2008-02-15 3:34 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-21 5:30 [patch] Use atomics for iclog reference counting David Chinner
2008-01-23 5:13 ` Timothy Shimmin
2008-02-14 23:47 ` David Chinner
2008-02-15 0:24 ` David Chinner
2008-02-15 3:34 ` Timothy Shimmin [this message]
2008-02-15 4:27 ` David Chinner
2008-02-15 5:05 ` Timothy Shimmin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=47B5085D.30409@sgi.com \
--to=tes@sgi.com \
--cc=dgc@sgi.com \
--cc=xfs-dev@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.