All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Nilsson <jesper.nilsson@axis.com>
To: Tejun Heo <tj@kernel.org>
Cc: Rabin Vincent <rabinv@axis.com>,
	Jesper Nilsson <jespern@axis.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH for-3.20-fixes] workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE
Date: Mon, 9 Feb 2015 17:29:55 +0100	[thread overview]
Message-ID: <20150209162955.GQ11399@axis.com> (raw)
In-Reply-To: <20150209161527.GH3220@htj.duckdns.org>

On Mon, Feb 09, 2015 at 05:15:27PM +0100, Tejun Heo wrote:
> Hello,

Hi,

> Can you please verify that the following patch fixes the issue?

Rabin will have to report if it fixes it for his synthetic case,
but I'll try it in my "real-world" jffs2 sync problem, and report
after a couple of hours.

> Thanks.

/Jesper

> -------- 8< --------
> cancel[_delayed]_work_sync() are implemented using
> __cancel_work_timer() which grabs the PENDING bit using
> try_to_grab_pending() and then flushes the work item with PENDING set
> to prevent the on-going execution of the work item from requeueing
> itself.
> 
> try_to_grab_pending() can always grab PENDING bit without blocking
> except when someone else is doing the above flushing during
> cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
> this case, __cancel_work_timer() currently invokes flush_work().  The
> assumption is that the completion of the work item is what the other
> canceling task would be waiting for too and thus waiting for the same
> condition and retrying should allow forward progress without excessive
> busy looping
> 
> Unfortunately, this doesn't work if preemption is disabled or the
> latter task has real time priority.  Let's say task A just got woken
> up from flush_work() by the completion of the target work item.  If,
> before task A starts executing, task B gets scheduled and invokes
> __cancel_work_timer() on the same work item, its try_to_grab_pending()
> will return -ENOENT as the work item is still being canceled by task A
> and flush_work() will also immediately return false as the work item
> is no longer executing.  This puts task B in a busy loop possibly
> preventing task A from executing and clearing the canceling state on
> the work item leading to a hang.
> 
> task A			task B			worker
> 
> 						executing work
> __cancel_work_timer()
>   try_to_grab_pending()
>   set work CANCELING
>   flush_work()
>     block for work completion
> 						completion, wakes up A
> 			__cancel_work_timer()
> 			while (forever) {
> 			  try_to_grab_pending()
> 			    -ENOENT as work is being canceled
> 			  flush_work()
> 			    false as work is no longer executing
> 			}
> 
> This patch removes the possible hang by updating __cancel_work_timer()
> to explicitly wait for clearing of CANCELING rather than invoking
> flush_work() after try_to_grab_pending() fails with -ENOENT.  The
> explicit wait uses the matching bit waitqueue for the CANCELING bit.
> 
> Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Rabin Vincent <rabin.vincent@axis.com>
> Cc: stable@vger.kernel.org
> ---
>  include/linux/workqueue.h |    3 ++-
>  kernel/workqueue.c        |   36 ++++++++++++++++++++++++++++++++----
>  2 files changed, 34 insertions(+), 5 deletions(-)
> 
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -70,7 +70,8 @@ enum {
>  	/* data contains off-queue information when !WORK_STRUCT_PWQ */
>  	WORK_OFFQ_FLAG_BASE	= WORK_STRUCT_COLOR_SHIFT,
>  
> -	WORK_OFFQ_CANCELING	= (1 << WORK_OFFQ_FLAG_BASE),
> +	__WORK_OFFQ_CANCELING	= WORK_OFFQ_FLAG_BASE,
> +	WORK_OFFQ_CANCELING	= (1 << __WORK_OFFQ_CANCELING),
>  
>  	/*
>  	 * When a work item is off queue, its high bits point to the last
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2730,17 +2730,35 @@ EXPORT_SYMBOL_GPL(flush_work);
>  
>  static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
>  {
> +	wait_queue_head_t *waitq = bit_waitqueue(&work->data,
> +						 __WORK_OFFQ_CANCELING);
>  	unsigned long flags;
>  	int ret;
>  
>  	do {
>  		ret = try_to_grab_pending(work, is_dwork, &flags);
>  		/*
> -		 * If someone else is canceling, wait for the same event it
> -		 * would be waiting for before retrying.
> +		 * If someone else is already canceling, wait for it to
> +		 * finish.  flush_work() doesn't work for PREEMPT_NONE
> +		 * because we may get scheduled between @work's completion
> +		 * and the other canceling task resuming and clearing
> +		 * CANCELING - flush_work() will return false immediately
> +		 * as @work is no longer busy, try_to_grab_pending() will
> +		 * return -ENOENT as @work is still being canceled and the
> +		 * other canceling task won't be able to clear CANCELING as
> +		 * we're hogging the CPU.
> +		 *
> +		 * Explicitly wait for completion using a bit waitqueue.
> +		 * We can't use wait_on_bit() as the CANCELING bit may get
> +		 * recycled to point to pwq if @work gets re-queued.
>  		 */
> -		if (unlikely(ret == -ENOENT))
> -			flush_work(work);
> +		if (unlikely(ret == -ENOENT)) {
> +			DEFINE_WAIT(wait);
> +			prepare_to_wait(waitq, &wait, TASK_UNINTERRUPTIBLE);
> +			if (work_is_canceling(work))
> +				schedule();
> +			finish_wait(waitq, &wait);
> +		}
>  	} while (unlikely(ret < 0));
>  
>  	/* tell other tasks trying to grab @work to back off */
> @@ -2749,6 +2767,16 @@ static bool __cancel_work_timer(struct w
>  
>  	flush_work(work);
>  	clear_work_data(work);
> +
> +	/*
> +	 * Paired with prepare_to_wait() above so that either
> +	 * waitqueue_active() is visible here or !work_is_canceling() is
> +	 * visible there.
> +	 */
> +	smp_mb();
> +	if (waitqueue_active(waitq))
> +		wake_up(waitq);
> +
>  	return ret;
>  }
>  

/^JN - Jesper Nilsson
-- 
               Jesper Nilsson -- jesper.nilsson@axis.com

  reply	other threads:[~2015-02-09 16:30 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-06 17:11 Soft lockups in __cancel_work_timer() Rabin Vincent
2015-02-06 18:01 ` Tejun Heo
2015-02-09 16:15 ` [PATCH for-3.20-fixes] workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE Tejun Heo
2015-02-09 16:29   ` Jesper Nilsson [this message]
2015-02-10  8:35     ` Jesper Nilsson
2015-02-10  9:33   ` Rabin Vincent
2015-03-02 12:26   ` Jesper Nilsson
2015-03-02 16:21     ` Tejun Heo
2015-03-03  7:52       ` Jesper Nilsson
2015-03-03 10:00       ` Tomeu Vizoso
2015-03-03 13:21         ` Tejun Heo
2015-03-03 13:45           ` [PATCH wq/for-4.0-fixes v2] " Tejun Heo
2015-03-03 13:57             ` Tomeu Vizoso
2015-03-05  9:24               ` Tomeu Vizoso
2015-03-05  9:36                 ` Tejun Heo
2015-03-05 11:46                   ` Tejun Heo
2015-03-05 13:03                 ` [PATCH v3 wq/for-4.0-fixes] " Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150209162955.GQ11399@axis.com \
    --to=jesper.nilsson@axis.com \
    --cc=jespern@axis.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rabinv@axis.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.