From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: mc@linux.vnet.ibm.com
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	Stephen Boyd <sboyd@codeaurora.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Nick Piggin <npiggin@kernel.dk>,
	david@fromorbit.com,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Maciej Rutecki <maciej.rutecki@gmail.com>
Subject: Re: [PATCH] VFS: br_write_lock locks on possible CPUs other than online CPUs
Date: Tue, 20 Dec 2011 16:06:59 +0530
Message-ID: <4EF0654B.4060904@linux.vnet.ibm.com>
In-Reply-To: <1324373854.21588.16.camel@mengcong>

On 12/20/2011 03:07 PM, mengcong wrote:

> On Tue, 2011-12-20 at 12:58 +0530, Srivatsa S. Bhat wrote:
>> On 12/20/2011 11:57 AM, Al Viro wrote:
>>
>>> On Tue, Dec 20, 2011 at 10:26:05AM +0530, Srivatsa S. Bhat wrote:
>>>> Oh, right, that has to be handled as well...
>>>>
>>>> Hmmm... How about registering a CPU hotplug notifier callback during lock init
>>>> time, and then for every cpu that gets onlined (after we took a copy of the
>>>> cpu_online_mask to work with), we see if that cpu is different from the ones
>>>> we have already locked, and if it is, we lock it in the callback handler and
>>>> update the locked_cpu_mask appropriately (so that we release the locks properly
>>>> during the unlock operation).
>>>>
>>>> Handling the newly introduced race between the callback handler and lock-unlock
>>>> code should not be difficult, I believe.
>>>>
>>>> Any loopholes in this approach? Or is the additional complexity just not worth
>>>> it here?
>>>
>>> To summarize the modified variant of that approach hashed out on IRC:
>>>
>>> 	* lglock grows three extra things: spinlock, cpu bitmap and cpu hotplug
>>> notifier.
>>> 	* foo_global_lock_online starts with grabbing that spinlock and
>>> loops over the cpus in that bitmap.
>>> 	* foo_global_unlock_online loops over the same bitmap and then drops
>>> that spinlock
>>> 	* callback of the notifier is going to do all bitmap updates.  Under
>>> that spinlock.  Events that need handling definitely include the things like
>>> "was going up but failed", since we need the bitmap to contain all online CPUs
>>> at all times, preferably without too much junk beyond that.  IOW, we need to add
>>> it there _before_ low-level __cpu_up() calls set_cpu_online().  Which means
>>> that we want to clean up on failed attempt to up it.  Taking a CPU down is
>>> probably less PITA; just clear bit on the final "the sucker's dead" event.
>>> 	* bitmap is initialized once, at the same time we set the notifier
>>> up.  Just grab the spinlock and do
>>> 	for_each_online_cpu(N)
>>> 		add N to bitmap
>>> then release the spinlock and let the callbacks handle all updates.
>>>
>>> I think that'll work with relatively little pain, but I'm not familiar enough
>>> with the cpuhotplug notifiers, so I'd rather have the folks familiar with those
>>> to supply the set of events to watch for...
>>>
>>
>>
>> We need not watch out for "up failed" events. It is enough if we handle
>> CPU_ONLINE and CPU_DEAD events, because these two events are triggered only
>> upon a successful online or offline operation, and these notifications are
>> more than enough for our purpose (to update our bitmaps). Also, those cpus
>> which came online won't start running until these "success notifications"
>> are all done, which is where we do our stuff in the callback (i.e., try
>> grabbing the spinlock).
>>
>> Of course, by doing this (only looking out for CPU_ONLINE and CPU_DEAD
>> events), our bitmap will probably be one step behind cpu_online_mask
>> (which means, we'll still have to take the snapshot of cpu_online_mask and
>> work with it instead of using for_each_online_cpu()).
>> But that doesn't matter, as long as:
>>   * we don't allow the newly onlined CPU to start executing code (this
>>     is achieved by taking the spinlock in the callback)
> 
> I think cpu notifier callback doesn't always run on the UPing cpu.
> Actually, it rarely runs on the UPing cpu.
> If I was wrong about the above thought, there is still a chance that lg-lock
> operations are scheduled on the UPing cpu before calling the callback.
> 


I wasn't actually banking on that, but you have raised a very good point.
The scheduler uses its own set of cpu hotplug callback handlers to start
using the newly added cpu (see the set of callbacks in kernel/sched.c).

So, now we have a race between our callback and the scheduler's callbacks.
("Placing" our callback appropriately in a safe position using priority
for notifiers doesn't appeal to me that much, since it looks like too much
hackery. It should probably be our last resort).

Regards,
Srivatsa S. Bhat

