public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Paul Jackson <pj@sgi.com>
To: Roman Zippel <zippel@linux-m68k.org>
Cc: akpm@osdl.org, torvalds@osdl.org, Simon.Derr@bull.net,
	linux-kernel@vger.kernel.org, nikita@clusterfs.com
Subject: Re: [PATCH] cpuset semaphore depth check optimize
Date: Thu, 15 Sep 2005 10:45:35 -0700	[thread overview]
Message-ID: <20050915104535.6058bbda.pj@sgi.com> (raw)
In-Reply-To: <Pine.LNX.4.61.0509150116150.3728@scrub.home>

Roman wrote:
> Sorry, I thought it was more obvious... :)

Likely it was obvious.  On the other hand, I wouldn't be in this mess
if I understood everything I should here.

Most of what you wrote this time actually made sense to me as I read
it.  Thanks for taking the time.

One part left to discuss and then I will put this aside for a few
days while I focus on my day job.


Roman, replying to pj:
> > > I don't think that works at all.  Consider the following sequence:
> > > 	1) ...
> > > 	4) Oops - we just missed doing a release.
> > 
> > If you want to do it like this, just add an else before the last 
> > atomic_dec().
> > The main point I was trying to make is that clearing tsk->cpuset doesn't 
> > require the spinlock, as long as you check for dead tasks in 
> > attach_task(). Ignore the rest if it's too confusing.

I don't think confusion is the problem here.  I think your proposal was
wrong.

There are *three* ways to get to a cpuset:
 1) via some task->cpuset pointer, 
 2) via that cpuset's path below /dev/cpuset, and
 3) walking up the parent chain from a child.

In the first half of your line:

> > 	if (atomic_read(&cs->count) == 1 && notify_on_release(cs)) {

if per chance the cs->count was one, then for an instant no other task
was using this cpuset, and it had no children.  But you can still get
to the cpuset via its /dev/cpuset path.

So by the time we get to the second half of this line where we check
for "notify_on_release(cs)", all hell could have broken loose, and
there might be 17 tasks using this self same cpuset, and 19 child
cpusets of this cpuset.  These interlopers. could have arrived by
accessing the cpuset using its path below /dev/cpuset.

The flip side is just as plausible.  We cannot, in any case, execute an
unguarded atomic_dec on cpuset->count, if that cpuset has been marked
notify_on_release, and if that cpuset is accessible by any of the
above three possible ways, due to the risk the decrement will put the
count to zero, and we'd miss issuing a release notifier.

Actually, even the single check that is in the code now:

        if (notify_on_release(cs)) {

would be bogus if we required instruction level synchronization, but it
is racing on an attribute set by some poor schlob user, so the kernel
need only preserve ordering at the system call level, not at the machine
instruction level.

Putting that part aside, why would you make a point of stating that
"that clearing tsk->cpuset doesn't require the spinlock"?  I don't
take cpuset_sem when I clear tsk->cpuset, so why would you think I'd
take this new spinlock instead?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

  reply	other threads:[~2005-09-15 17:46 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-12 11:30 [PATCH] cpuset semaphore depth check optimize Paul Jackson
2005-09-12 11:39 ` Andrew Morton
2005-09-12 14:38   ` Linus Torvalds
2005-09-12 15:32     ` Paul Jackson
     [not found]     ` <17189.39100.862561.865802@gargle.gargle.HOWL>
2005-09-12 15:38       ` Paul Jackson
2005-09-13  7:04     ` Ingo Molnar
2005-09-13 22:21       ` Paul Jackson
2005-09-12 14:51   ` Paul Jackson
2005-09-12 17:07     ` Roman Zippel
     [not found]       ` <20050912153135.3812d8e2.pj@sgi.com>
2005-09-13 11:42         ` Roman Zippel
2005-09-13 17:37           ` Paul Jackson
2005-09-14  6:01             ` Paul Jackson
2005-09-14 15:56             ` Roman Zippel
2005-09-14 19:46               ` Paul Jackson
2005-09-15 10:51                 ` Roman Zippel
2005-09-15 17:45                   ` Paul Jackson [this message]
2005-09-15 19:18                     ` Paul Jackson
2005-09-17  2:06                     ` Roman Zippel
2005-09-17  2:27                       ` Roman Zippel
2005-09-20  7:57                     ` Paul Jackson
2005-09-20 12:05                       ` Robin Holt
2005-09-20 12:21                         ` Robin Holt
2005-09-20 13:10                         ` Simon Derr
2005-09-20 14:22                         ` Paul Jackson
2005-09-20 14:54                           ` Robin Holt
2005-09-20 15:07                             ` Linus Torvalds
2005-09-20 15:14                             ` Paul Jackson
2005-09-20 15:17                             ` Simon Derr
2005-09-12 11:48 ` Nikita Danilov
2005-09-12 13:25   ` Paul Jackson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050915104535.6058bbda.pj@sgi.com \
    --to=pj@sgi.com \
    --cc=Simon.Derr@bull.net \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nikita@clusterfs.com \
    --cc=torvalds@osdl.org \
    --cc=zippel@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox