From: Keith Owens <kaos@ocs.com.au>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu
Subject: Re: Is stopmachine() preempt safe?
Date: Mon, 28 Aug 2006 16:36:56 +1000 [thread overview]
Message-ID: <10518.1156747016@kao2.melbourne.sgi.com> (raw)
In-Reply-To: Your message of "Mon, 28 Aug 2006 16:17:11 +1000." <1156745831.10467.44.camel@localhost.localdomain>
Rusty Russell (on Mon, 28 Aug 2006 16:17:11 +1000) wrote:
>On Mon, 2006-08-28 at 12:55 +1000, Keith Owens wrote:
>> Rusty Russell (on Mon, 28 Aug 2006 09:38:55 +1000) wrote:
>> >On Sun, 2006-08-27 at 19:42 +1000, Keith Owens wrote:
>> >> I cannot convince myself that stopmachine() is preempt safe. What
>> >> prevents this race with CONFIG_PREEMPT=y?
>> >
>> >Nothing. Read side is preempt_disable. Write side is stopmachine.
>>
>> That is very worrying. The whole point of stopmachine is to get all
>> cpus to a known state with no locally cached global data, so the caller
>> of stopmachine can safely fiddle with some global data (like updating
>> the module lists). But CONFIG_PREEMPT defeats this and turns any code
>> that relies on stopmachine into a race.
>
>Hi Keith,
>
> Do not panic: this is not a bug, and was never how stop_machine was to
>be used. try_module_get() disables preemption, and you are supposed to
>disable preemption
Disabling preemption only guarantees that the current task will not be
preempted. It says nothing about the state of tasks that were already
running on the cpus when stopmachine was called. It is these tasks
that were already running and were preempted by stopmachine that have
to be flushed before stopmachine can continue. Remember that this is
the race:
cpu 0 cpu 1
stop_machine()
Process <n> reads a global resource
do_stop()
kernel_thread(stopmachine, 1)
Process <n> is preempted
stopmachine() runs on cpu 1
STOPMACHINE_PREPARE
STOPMACHINE_DISABLE_IRQ
do_stop() calls smdata->fn
smdata->fn changes global data
restart_machine()
STOPMACHINE_EXIT
stopmachine() exits
Scheduler resumes process <n>
The global resource is out of sync
>(or take cpu_hotplug_lock) when relying on
>cpu_online_map.
There is a lot of code in the kernel that runs cpu_online_map without
taking any locks and without disabling preemption. Obviously we do not
want all that code to lock or disable preemption, it will kill
scalability. The lack of locking on the read side means that changes
to cpu_online_map have to be done under stopmachine, which brings in
the race again.
I will have a look at your scheduler patch and see if it fixes the
race.
next prev parent reply other threads:[~2006-08-28 6:37 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-08-27 9:42 Is stopmachine() preempt safe? Keith Owens
2006-08-27 23:38 ` Rusty Russell
2006-08-28 2:55 ` Keith Owens
2006-08-28 6:17 ` Rusty Russell
2006-08-28 6:36 ` Keith Owens [this message]
2006-08-28 9:21 ` Rusty Russell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=10518.1156747016@kao2.melbourne.sgi.com \
--to=kaos@ocs.com.au \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=rusty@rustcorp.com.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.