From: Benjamin LaHaise <bcrl@kvack.org>
To: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: linux-rt-users@vger.kernel.org,
Mike Galbraith <umgwanakikbuti@gmail.com>,
Kent Overstreet <kmo@daterainc.com>
Subject: Re: [PATCH-rt 1/2] aio: don't take the ctx_lock during rcu garbage collection at reboot
Date: Sun, 15 Feb 2015 18:50:05 -0500 [thread overview]
Message-ID: <20150215235005.GA26974@kvack.org> (raw)
In-Reply-To: <1424033559-63051-2-git-send-email-paul.gortmaker@windriver.com>
On Sun, Feb 15, 2015 at 03:52:38PM -0500, Paul Gortmaker wrote:
> On reboot in 3.14-rt we see this splat:
>
> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:905
> in_atomic(): 1, irqs_disabled(): 0, pid: 84, name: rcuc/12
> Preemption disabled at:[<ffffffff810b29bc>] rcu_cpu_kthread+0x28c/0x700
>
> CPU: 18 PID: 121 Comm: rcuc/18 Tainted: G D 3.14.33-rt28-WR7.0.0.0_ovp+ #2
> Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
> ffff880fad3e7a70 ffff880fb11cfd58 ffffffff81a0bcee 0000000000000000
> ffff880fb11cfd70 ffffffff8107b5eb ffff880fad3e7a40 ffff880fb11cfd88
> ffffffff81a136b0 ffff880fad3e7900 ffff880fb11cfdb0 ffffffff811c925c
> Call Trace:
> [<ffffffff81a0bcee>] dump_stack+0x4e/0x7a
> [<ffffffff8107b5eb>] __might_sleep+0xdb/0x150
> [<ffffffff81a136b0>] rt_spin_lock+0x20/0x50
> [<ffffffff811c925c>] free_ioctx_users+0x2c/0xe0
> [<ffffffff8140136b>] percpu_ref_kill_rcu+0x11b/0x120
> [<ffffffff810aded5>] rcu_cpu_kthread+0x295/0x6f0
> [<ffffffff81076d8d>] smpboot_thread_fn+0x17d/0x2b0
> [<ffffffff81a110f0>] ? schedule+0x30/0xa0
> [<ffffffff81076c10>] ? SyS_setgroups+0x170/0x170
> [<ffffffff8106f4a4>] kthread+0xc4/0xe0
> [<ffffffff8106f3e0>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff81a147ec>] ret_from_fork+0x7c/0xb0
> [<ffffffff8106f3e0>] ? flush_kthread_worker+0x90/0x90
>
> The problem is as stated in lib/percpu-refcount.c:
>
> * Note that @release must not sleep - it may potentially be called from RCU
> * callback context by percpu_ref_kill().
>
> So in -rt the non-raw spinlock trips due to this. The same problem was
> reported by Mike Galbraith earlier[1] but no conclusive fix was added
> to the -rt series from that thread. In one post[2], Mike mentions:
>
> I was sorely tempted to post a tiny variant that dropped taking
> ctx_lock in free_ioctx_users() entirely, as someone diddling with
> no reference didn't make sense.
>
> I was thinking the same thing but with a safety net in where we
> check and warn if anyone else tries to take the lock while we are
> doing the final garbage collection, which is what we now do here.
Your code introduces a bug: the lock is necessary to protect teardown
initiated cancellation of an iocb against the simultaneous completion of
that iocb. Checking the is locked state is not sufficient protection and
can lead to awful to debug bugs. So this is a NAK, unless you want to
introduce more bugs. You need to actually fix the problem while somehow
maintaining the consistency the lock provides.
-ben
> [1] https://lkml.org/lkml/2014/6/8/10
> [2] https://lkml.org/lkml/2014/6/9/979
>
> Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> ---
> fs/aio.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 2f7e8c2e3e76..d806f7223822 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -539,9 +539,14 @@ static void free_ioctx_users(struct percpu_ref *ref)
> struct kioctx *ctx = container_of(ref, struct kioctx, users);
> struct kiocb *req;
>
> +#ifndef CONFIG_PREEMPT_RT_BASE
> spin_lock_irq(&ctx->ctx_lock);
> +#endif
>
> while (!list_empty(&ctx->active_reqs)) {
> +#ifdef CONFIG_PREEMPT_RT_BASE
> + WARN_ON(spin_is_locked(&ctx->ctx_lock));
> +#endif
> req = list_first_entry(&ctx->active_reqs,
> struct kiocb, ki_list);
>
> @@ -549,7 +554,9 @@ static void free_ioctx_users(struct percpu_ref *ref)
> kiocb_cancel(ctx, req);
> }
>
> +#ifndef CONFIG_PREEMPT_RT_BASE
> spin_unlock_irq(&ctx->ctx_lock);
> +#endif
>
> percpu_ref_kill(&ctx->reqs);
> percpu_ref_put(&ctx->reqs);
> --
> 2.1.0
--
"Thought is the essence of where you are now."
next prev parent reply other threads:[~2015-02-15 23:50 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-15 20:52 [PATCH-rt 0/2] Fix up aio splats at reboot garbage collection Paul Gortmaker
2015-02-15 20:52 ` [PATCH-rt 1/2] aio: don't take the ctx_lock during rcu garbage collection at reboot Paul Gortmaker
2015-02-15 23:50 ` Benjamin LaHaise [this message]
2015-02-15 20:52 ` [PATCH-rt 2/2] aio: fix splat in free_ioctx_reqs by using percpu callback Paul Gortmaker
2015-02-18 11:28 ` [PATCH-rt 0/2] Fix up aio splats at reboot garbage collection Sebastian Andrzej Siewior
2015-02-18 13:33 ` Mike Galbraith
2015-02-20 17:31 ` Paul Gortmaker
2015-02-20 17:39 ` Benjamin LaHaise
2015-02-20 17:46 ` Paul Gortmaker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150215235005.GA26974@kvack.org \
--to=bcrl@kvack.org \
--cc=kmo@daterainc.com \
--cc=linux-rt-users@vger.kernel.org \
--cc=paul.gortmaker@windriver.com \
--cc=umgwanakikbuti@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.