Re: [PATCH] Fix race between attach_task and cpuset_exit

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Paul Jackson <pj@sgi.com>
To: vatsa@in.ibm.com
Cc: balbir@in.ibm.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Fix race between attach_task and cpuset_exit
Date: Mon, 26 Mar 2007 11:30:47 -0700	[thread overview]
Message-ID: <20070326113047.faf591cb.pj@sgi.com> (raw)
In-Reply-To: <20070326115046.GK11794@in.ibm.com>

vatsa wrote:
> Well, someone may have attached to this cpuset while we were waiting on the 
> mutex_lock(). So we need to do a atomic_read again to ensure it is still
> unused

pj replied:
> If we hold the task lock that now
> (thanks to your good work) guards this pointer, and if we decrement to
> zero the reference count on the cpuset to which it points 

I incorrectly described the locking, I think.

A cpusets reference count increases if either another task is attached
to it, or if a task already attached forks.

If we decrement to zero the count, we -know- that no more tasks are
attached to it.

If we hold the cpuset manage_mutex, then we -know- that attach_task can't
attach tasks to it.

But now that you mention it, that additional atomic_read of the count in
check_for_release() seems suspicious to me.  I'm afraid that the following
could happen:

    1) given cpusets A and A/B, with a single task attached to A (none to B)
    2) some other tasks issues a "rmdir A/B"
    3) near the end of the cpuset_rmdir() code, after we have removed A/B, we
	invoke check_for_release()
    4) just at that instant, the single task in A exits, decrementing the
	count on A to zero
    5) both the exiting task and the task doing the rmdir execute the
	cpuset_release_agent() and check_for_release() code.

Aha - yes, maybe that could happen, but it is OK !!

Multiple tasks all pounding on the same cpuset with this release logic
is not a problem. That just ends up being multiple tasks doing a 'rmdir'
on that cpuset from user space.  At most one of them succeeds in
removing the directory, and if it is removed, then the remaining get an
error that there is no such directory.

The race I worried about in last nights post is NOT a problem:
> Is there perhaps another race here?  Could it happen that:
>  1) the atomic_dec() lowers the count to say one (any value > zero)
>  2) after we drop the task lock, some other task or tasks decrement
>     the count to zero
>  3) we catch that zero when we atomic_read the count, and issue a spurious
>     check_for_release().

This is one of the advantages of not actually unlinking cpusets at this point,
when it seems they are no longer used.  We just fire off a user mode helper
thread to attempt a subsequent removal.  That separate thread will get the locking
correct, from the top down, and if the cpuset is still really and truly unused,
then and only then actually remove it.

Simultaneous spurious check_for_release() calls are not a problem!

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

next prev parent reply	other threads:[~2007-03-26 18:30 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-25 16:47 [PATCH] Fix race between attach_task and cpuset_exit Srivatsa Vaddagiri
2007-03-25 17:52 ` Balbir Singh
2007-03-25 19:54   ` Paul Jackson
2007-03-26 11:50   ` Srivatsa Vaddagiri
2007-03-26 17:58     ` Paul Jackson
2007-03-27  6:35       ` Srivatsa Vaddagiri
2007-03-27  8:45         ` Paul Jackson
2007-03-26 18:30     ` Paul Jackson [this message]
2007-03-25 19:50 ` Paul Jackson
2007-03-26 11:55   ` Srivatsa Vaddagiri
2007-04-05  5:55     ` Paul Menage
2007-04-05  7:00       ` Srivatsa Vaddagiri
2007-04-05  7:01         ` Paul Menage
2007-04-05  8:14           ` Srivatsa Vaddagiri
2007-04-05  8:10             ` Paul Menage
2007-04-10 17:12       ` Srivatsa Vaddagiri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070326113047.faf591cb.pj@sgi.com \
    --to=pj@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@in.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vatsa@in.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox