From: Ben Greear <greearb@candelatech.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] sunrpc: Fix race between work-queue and rpc_killall_tasks.
Date: Wed, 06 Jul 2011 17:07:32 -0700
Message-ID: <4E14F8C4.2010508@candelatech.com>
In-Reply-To: <1309995932.5447.6.camel@lade.trondhjem.org>

On 07/06/2011 04:45 PM, Trond Myklebust wrote:
> On Wed, 2011-07-06 at 15:49 -0700, greearb@candelatech.com wrote:
>> From: Ben Greear<greearb@candelatech.com>
>>
>> The rpc_killall_tasks logic is not locked against
>> the work-queue thread, but it still directly modifies
>> function pointers and data in the task objects.
>>
>> This patch changes the killall-tasks logic to set a flag
>> that tells the work-queue thread to terminate the task
>> instead of directly calling the terminate logic.
>>
>> Signed-off-by: Ben Greear<greearb@candelatech.com>
>> ---
>>
>> NOTE: This needs review, as I am still struggling to understand
>> the rpc code, and it's quite possible this patch either doesn't
>> fully fix the problem or actually causes other issues. That said,
>> my nfs stress test seems to run a bit more stable with this patch applied.
>
> Yes, but I don't see why you are adding a new flag, nor do I see why we
> want to keep checking for that flag in the rpc_execute() loop.
> rpc_killall_tasks() is not a frequent operation that we want to optimise
> for.

I was hoping that if the killall logic never wrote anything that the
work-queue thread also writes, it would be safe without needing
explicit locking.

I was a bit concerned that my flags |= KILLME logic could overwrite
flag bits being written elsewhere at the same time, since a plain |=
is a non-atomic read-modify-write (so maybe I'd have to add a
completely new variable for the KILLME flag to really be safe).
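
To make the lost-update concern concrete, here is a userspace sketch using C11 atomics. It is not kernel code: the demo_* names and flag values are hypothetical, and in the kernel the analogous tools would be the atomic bitops such as set_bit() and test_and_clear_bit(). The point is only that an atomic OR cannot drop a bit that another writer sets concurrently, whereas a plain |= can.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bits, for illustration only (not real RPC_TASK_* values). */
#define DEMO_TASK_RUNNING 0x1u
#define DEMO_TASK_KILLME  0x2u

struct demo_task {
	atomic_uint flags;	/* shared between the killer and the worker */
};

/*
 * Killer side: set KILLME with a single atomic read-modify-write, so a
 * concurrent update to another bit in the same word is not overwritten.
 */
static void demo_request_kill(struct demo_task *t)
{
	atomic_fetch_or(&t->flags, DEMO_TASK_KILLME);
}

/*
 * Worker side: clear KILLME atomically and report whether it was set.
 * Returns 1 if a kill had been requested, 0 otherwise.
 */
static int demo_test_and_clear_kill(struct demo_task *t)
{
	unsigned int old = atomic_fetch_and(&t->flags, ~DEMO_TASK_KILLME);

	return (old & DEMO_TASK_KILLME) != 0;
}
```

With this shape the killer never touches function pointers at all; the worker notices the request at a point of its own choosing and runs the terminate path itself.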
>
> How about the following instead?

I think it still races; more comments below.

>
> 8<----------------------------------------------------------------------------------
> From ecb7244b661c3f9d2008ef6048733e5cea2f98ab Mon Sep 17 00:00:00 2001
> From: Trond Myklebust<Trond.Myklebust@netapp.com>
> Date: Wed, 6 Jul 2011 19:44:52 -0400
> Subject: [PATCH] SUNRPC: Fix a race between work-queue and rpc_killall_tasks
>
> Since rpc_killall_tasks may modify the rpc_task's tk_action field
> without any locking, we need to be careful when dereferencing it.
> + do_action = task->tk_callback;
> + task->tk_callback = NULL;
> + if (do_action == NULL) {

I think the race still exists, though it would be harder to hit.
What if the killall logic sets task->tk_callback right after you
assign do_action, but before you set tk_callback to NULL? Or, for
that matter, after you set tk_callback to NULL?
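
As an illustration of closing the window between the load and the store, here is a userspace sketch that fetches and clears the pointer in one atomic exchange. It uses C11 atomics and hypothetical demo_* names, not the real rpc_task layout, and it is only an assumption that the same shape would suit the kernel code (where xchg() would be the closest equivalent).

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_task;
typedef void (*demo_action_t)(struct demo_task *);

/* Hypothetical stand-in for the task; only the fields the sketch needs. */
struct demo_task {
	_Atomic(demo_action_t) callback;	/* written by worker and killer */
	int runs;				/* counts callback invocations */
};

/* Example callback, used only to observe that it ran. */
static void demo_cb(struct demo_task *t)
{
	t->runs++;
}

/*
 * Fetch the callback and reset it to NULL in one atomic step, so there
 * is no gap between "read the pointer" and "clear the pointer" in which
 * a store from another thread could be silently overwritten.
 */
static demo_action_t demo_take_callback(struct demo_task *t)
{
	return atomic_exchange(&t->callback, (demo_action_t)NULL);
}

/* Run the pending callback, if any. Returns 1 if one ran, 0 otherwise. */
static int demo_run_callback_once(struct demo_task *t)
{
	demo_action_t fn = demo_take_callback(t);

	if (fn == NULL)
		return 0;
	fn(t);
	return 1;
}
```

In this sketch a store that lands before the exchange is the value that gets run, and a store that lands after it stays in place for the next pass, so neither is lost. Whether running it on the next pass is acceptable for a killed task is a separate question.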
> /*
> * Perform the next FSM step.
> - * tk_action may be NULL when the task has been killed
> - * by someone else.
> + * tk_action may be NULL if the task has been killed.
> + * In particular, note that rpc_killall_tasks may
> + * do this at any time, so beware when dereferencing.
> */
> - if (task->tk_action == NULL)
> + do_action = task->tk_action;
> + if (do_action == NULL)
> break;
> - task->tk_action(task);
> }
> + do_action(task);
>
> /*
> * Lockless check for whether task is sleeping or not.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com