public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Ben Myers <bpm@sgi.com>
To: Christoph Hellwig <hch@infradead.org>,
	Chandra Seetharaman <sekharan@us.ibm.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 4/4] xfs: fix the logspace waiting algorithm
Date: Thu, 1 Dec 2011 12:32:33 -0600	[thread overview]
Message-ID: <20111201183233.GY29840@sgi.com> (raw)
In-Reply-To: <20111128081925.981681380@bombadil.infradead.org>

Hi,

On Mon, Nov 28, 2011 at 03:17:36AM -0500, Christoph Hellwig wrote:
> Apply the scheme used in log_regrant_write_log_space to wake up any other
> threads waiting for log space before the newly added one to
> log_regrant_write_log_space as well, and factor the code into readable
> helpers.  For each of the queues we have add two helpers:
> 
>  - one to try to wake up all waiting threads.  This helper will also be
>    usable by xfs_log_move_tail once we remove the current opportunistic
>    wakeups in it.
>  - one to sleep on t_wait until enough log space is available, loosely
>    modelled after Linux waitqueues.
>  
> And use them to reimplement the guts of log_regrant_write_log_space and
> log_regrant_write_log_space.  These two function now use one and the same
> algorithm for waiting on log space instead of subtly different ones before,
> with an option to completely unify them in the near future.
> 
> Also move the filesystem shutdown handling to the common caller given
> that we had to touch it anyway.
> 
> Based on hard debugging and an earlier patch from
> Chandra Seetharaman <sekharan@us.ibm.com>.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

I'd like to make sure that I understand the race that Chandra
debugged and reported.

2499 STATIC int
2500 xlog_grant_log_space(xlog_t        *log,
2501                      xlog_ticket_t *tic)
2502 {
2503         int              free_bytes;
2504         int              need_bytes;
...
2517         /* something is already sleeping; insert new transaction at end */
2518         if (!list_empty_careful(&log->l_reserveq)) {
2519                 spin_lock(&log->l_grant_reserve_lock);
2520                 /* recheck the queue now we are locked */
2521                 if (list_empty(&log->l_reserveq)) {
2522                         spin_unlock(&log->l_grant_reserve_lock);
2523                         goto redo;
2524                 }
2525                 list_add_tail(&tic->t_queue, &log->l_reserveq);
2526 
2527                 trace_xfs_log_grant_sleep1(log, tic);
2528 
2529                 /*
2530                  * Gotta check this before going to sleep, while we're
2531                  * holding the grant lock.
2532                  */
2533                 if (XLOG_FORCED_SHUTDOWN(log))
2534                         goto error_return;
2535 
2536                 XFS_STATS_INC(xs_sleep_logspace);
2537                 xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock);
2538 
2539                 /*
2540                  * If we got an error, and the filesystem is shutting down,
2541                  * we'll catch it down below. So just continue...
2542                  */
2543                 trace_xfs_log_grant_wake1(log, tic);
2544         }
2545 
2546 redo:
2547         if (XLOG_FORCED_SHUTDOWN(log))
2548                 goto error_return_unlocked;
2549 
2550         free_bytes = xlog_space_left(log, &log->l_grant_reserve_head);
2551         if (free_bytes < need_bytes) {
2552                 spin_lock(&log->l_grant_reserve_lock);
2553                 if (list_empty(&tic->t_queue))
2554                         list_add_tail(&tic->t_queue, &log->l_reserveq);
2555    
2556                 trace_xfs_log_grant_sleep2(log, tic);
2557 
2558                 if (XLOG_FORCED_SHUTDOWN(log))
2559                         goto error_return;
2560 
2561                 xlog_grant_push_ail(log, need_bytes);
2562 
2563                 XFS_STATS_INC(xs_sleep_logspace);
2564                 xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock);
2565 
2566                 trace_xfs_log_grant_wake2(log, tic);
2567                 goto redo;
2568         }
2569 
2570         if (!list_empty(&tic->t_queue)) {
2571                 spin_lock(&log->l_grant_reserve_lock);
2572                 list_del_init(&tic->t_queue);
2573                 spin_unlock(&log->l_grant_reserve_lock);
2574         }

So the race that we're looking at here is:

process A was added to the reserve queue at either 2525 or 2554 and, pushes the AIL at 2561,
xfsaild frees up enough log space for process A (possibly B?), eventually xfs_log_move_tail is called to wake process A,
process A wakes at line 2564, and he is on the reserveq already,
process B sees that there are tickets on the queue at 2518 and gets the grant reserve lock at 2519,
process A spins at 2571 waiting for the grant reserve lock,
process B adds itself to the queue at 2525, 
process B drops the grant reserve lock and goes to sleep at 2537
process A takes the grant reserve lock at 2571 and removes it's ticket from the list.

...and there is nothing to wake process B until the ail is pushed by
some other process.

Is that about right?

Thanks,
Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2011-12-01 18:32 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-28  8:17 [PATCH 0/4] xfs fixes for Linux 3.2-rc3 Christoph Hellwig
2011-11-28  8:17 ` [PATCH 1/4] xfs: fix attr2 vs large data fork assert Christoph Hellwig
2011-11-28 18:01   ` Ben Myers
2011-11-28 18:03     ` Christoph Hellwig
2011-11-28  8:17 ` [PATCH 2/4] xfs: validate acl count Christoph Hellwig
2011-11-28  8:17 ` [PATCH 3/4] xfs: force buffer writeback before blocking on the ilock in inode reclaim Christoph Hellwig
2011-11-28  8:17 ` [PATCH 4/4] xfs: fix the logspace waiting algorithm Christoph Hellwig
2011-11-30 23:56   ` Ben Myers
2011-12-01  7:21     ` Christoph Hellwig
2011-12-01 19:51       ` Ben Myers
2011-12-02 11:51         ` Christoph Hellwig
2011-12-05  2:53           ` Dave Chinner
2011-12-05  9:05             ` Christoph Hellwig
2011-12-01 18:32   ` Ben Myers [this message]
2011-12-01 20:43     ` Chandra Seetharaman
2011-12-01 19:28   ` Chandra Seetharaman
2011-12-02 11:16     ` Christoph Hellwig
2011-12-02 16:02       ` Chandra Seetharaman
2011-11-29 19:23 ` [PATCH 0/4] xfs fixes for Linux 3.2-rc3 Ben Myers
2011-11-30  8:54   ` Christoph Hellwig
2011-11-30  8:58 ` [PATCH 5/4] xfs: fix nfs export of 64-bit inodes numbers on 32-bit kernels Christoph Hellwig
2011-12-02 16:07   ` Ben Myers
2011-12-02 17:29     ` Christoph Hellwig
2011-12-06 16:39       ` Ben Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111201183233.GY29840@sgi.com \
    --to=bpm@sgi.com \
    --cc=hch@infradead.org \
    --cc=sekharan@us.ibm.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox