All of lore.kernel.org
 help / color / mirror / Atom feed
From: Timothy Shimmin <tes@sgi.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: markgw@sgi.com, xfs-oss <xfs@oss.sgi.com>
Subject: Re: deadlocked xfs
Date: Fri, 11 Jul 2008 13:21:39 +1000	[thread overview]
Message-ID: <4876D1C3.4000006@sgi.com> (raw)
In-Reply-To: <4876C9EB.7060601@sgi.com>

Mark Goodwin wrote:
> Thanks for the report Eric. This looks very similar to a
> deadlock Lachlan recently hit in the patch for
> "Use atomics for iclog reference counting"
> http://oss.sgi.com/archives/xfs/2008-02/msg00130.html
> 
> It seems this patch can cause deadlocks under heavy log traffic.
> I don't think anyone has a fix yet ... Lachlan is out this week,
> but Tim can follow-up here ...
> 
> Cheers
> -- Mark
> 

Okay, what Mark is talking about is pv#983925.
Details below from the bug - from Lachlan and me.
I'm sorry that this info didn't go out sooner.
I'm not sure if Lachlan got any further with this but he's
been away.

--Tim

Lachlan wrote:
[....stuff deleted...]
> It's the same signature every time with the current iclog in WANT_SYNC state
> and the rest of the iclogs in DO_CALLBACK state.  The current iclog log should
> transition from WANT_SYNC to SYNCING to DONE_SYNC and eventually to ACTIVE so
> the threads above can continue but it's not a happening.
> 
> I suspect this mod is the culprit:
> 
> mod xfs-linux-melb:xfs-kern:30505a header
> ==========================================
>  - SM_Location:  longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb
>  - Workarea:     chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs
>  - xfs-linux-melb:xfs-kern:30505a 02/15/08
>  - PV Incidents affected:        975671
>  - Inspected by:         tes@sgi.com
>  - Description:
>    Use atomics for iclog reference counting
> 
>    Now that we update the log tail LSN less frequently on
>    transaction completion, we pass the contention straight to
>    the global log state lock (l_iclog_lock) during transaction
>    completion.
>    
>    We currently have to take this lock to decrement the iclog
>    reference count. there is a reference count on each iclog,
>    so we need to take &#65533;he global lock for all refcount changes.
>    
>    When large numbers of processes are all doing small trnasctions,
>    the iclog reference counts will be quite high, and the state change
>    that absolutely requires the l_iclog_lock is the except rather than
>    the norm.
>    
>    Change the reference counting on the iclogs to use atomic_inc/dec
>    so that we can use atomic_dec_and_lock during transaction completion
>    and avoid the need for grabbing the l_iclog_lock for every reference
>    count decrement except the one that matters - the last.
>  - Files affected:  xfsidbg.c 1.343, xfs_log.c 1.347,
>    xfs_log_priv.h 1.125
>  - Author:       dgc
>  - xfsidbg.c:
>    use correct atomic accessor for the iclog refcount.
>  - xfs_log.c:
>    Reduce contention on the iclog state lock by using atomic
>    reference counters for the iclogs and only grabbing the
>    iclog lock on transaction completion when the last reference
>    to the iclog is being removed.
>  - xfs_log_priv.h:
>    change ic_refcount to an atomic_t.
> 
> 
> It made this change which moved the refcount 'decrement and test' operation
> outside the l_icloglock spinlock and therefore can race. 
> 
> @@ -2819,18 +2819,21 @@ xlog_state_release_iclog(
>  {
>         int             sync = 0;       /* do we sync? */
>  
> -       spin_lock(&log->l_icloglock);
> +       if (iclog->ic_state & XLOG_STATE_IOERROR)
> +               return XFS_ERROR(EIO);
> +
> +       ASSERT(atomic_read(&iclog->ic_refcnt) > 0);
> +       if (!atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock))
> +               return 0;
> +
>         if (iclog->ic_state & XLOG_STATE_IOERROR) {
>                 spin_unlock(&log->l_icloglock);
>                 return XFS_ERROR(EIO);
>         }
> 
> The above code will race with this code in xlog_state_get_iclog_space():
> 
> 	spin_lock(&log->l_icloglock);
> 	atomic_inc(&iclog->ic_refcnt);	/* prevents sync */
> 
> 	...
> 
> 	if (iclog->ic_size - iclog->ic_offset < 2*sizeof(xlog_op_header_t)) {
> 		xlog_state_switch_iclogs(log, iclog, iclog->ic_size);
> 
> 		/* If I'm the only one writing to this iclog, sync it to disk */
> 		if (atomic_read(&iclog->ic_refcnt) == 1) {
> 			spin_unlock(&log->l_icloglock);
> 			if ((error = xlog_state_release_iclog(log, iclog)))
> 				return error;
> 		} else {
> 			atomic_dec(&iclog->ic_refcnt);
> 			spin_unlock(&log->l_icloglock);
> 		}
> 		goto restart;
> 	}
> 
> Previously xlog_state_release_iclog() would wait for the spinlock to be
> released above but will now do the atomic_dec_and_lock() while the above
> code is executing.  The above atomic_read() will return 2, the
> atomic_dec_and_lock() in xlog_state_release_iclog() will return false and
> return without syncing the iclog and then the above code will decrement the
> atomic counter.  The iclog never gets out of WANT_SYNC state and everything
> gets stuck behind it.
> 
> 
> ==========================
> ADDITIONAL INFORMATION (ADD)
> From: tes@sgi.com (BugWorks)
> Date: Jul 01 2008 09:59:49PM
> ==========================
> 
> Hi,
> 
> Okay, I've just been trying to understand what Lachlan was saying
> about the problem and Lachlan kindly gave me an explanation.
> 
> Based on that, below is my understanding so I remember what is going
> wrong here ;-) Correct me if I got this wrong.
> 
> 
> Bad scenario (current code)
> ----------------------------
> Process A has a reference on iclog
> and iclog-refcnt == 1
> 
> Process B tries to get some log space,
> and takes lock and bumps count to 2
> Process B tests to see if its refcnt is just 1
> but it isn't so it moves on, _up_ to the refcnt decrement
> 
> Process A decides to release its iclog
> and calls xlog_state_release_iclog
> It does a decrement WITHOUT taking lock
> The refcnt goes from 2 to 1 and so it just
> returns without sync'ing.
> 
> Process B now does its decrement and takes
> the refcnt from 1 to 0. and releases the lock.
> 
> 
> Good scenario (with previous code)
> -----------------------------------
> Process A has a reference on iclog
> and iclog-refcnt == 1
> 
> Process B tries to get some logspace,
> and takes lock and bumps count to 2.
> Process B tests to see if its refcnt is just 1
> but it isn't so it moves on, _up_ to the refcnt decrement.
> 
> Process A decides to release its iclog
> and calls xlog_state_release_iclog.
> However, it can't get the lock yet and so it waits.
> 
> Process B now decrements its refcnt from 2
> back to 1 and releases the lock.
> 
> Process A now can get its lock and decrements
> the refcnt, finds it goes from 1 to zero and
> can now do the sync'ing.
> -----------------------------------
> 
> --Tim

  parent reply	other threads:[~2008-07-11  3:20 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-11  2:33 deadlocked xfs Eric Sandeen
2008-07-11  2:48 ` Mark Goodwin
2008-07-11  2:56   ` Eric Sandeen
2008-07-11  3:21   ` Timothy Shimmin [this message]
2008-07-11  4:04     ` Dave Chinner
2008-07-11  4:10       ` Eric Sandeen
2008-07-11  4:13         ` Timothy Shimmin
2008-07-11  4:18           ` Eric Sandeen
2008-07-11  4:26             ` Eric Sandeen
2008-07-11  4:27             ` Dave Chinner
2008-07-11  4:17       ` Christoph Hellwig
2008-07-11  4:44       ` Timothy Shimmin
2008-07-11  3:22   ` Dave Chinner
2008-07-11  3:50     ` Mark Goodwin
2008-07-11  4:02       ` Eric Sandeen
2008-07-11  4:05       ` Dave Chinner
2008-07-11  4:08       ` Timothy Shimmin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4876D1C3.4000006@sgi.com \
    --to=tes@sgi.com \
    --cc=markgw@sgi.com \
    --cc=sandeen@sandeen.net \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.