From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andi Kleen <andi@firstfloor.org>,
	Lucas De Marchi <lucas.de.marchi@gmail.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	david@gibson.dropbear.id.au, Kees Cook <keescook@chromium.org>,
	Serge Hallyn <serge.hallyn@canonical.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Feng Hong <hongfeng@marvell.com>,
	Lucas De Marchi <lucas.demarchi@profusion.mobi>
Subject: Re: [PATCH 1/1] poweroff: change orderly_poweroff() to use schedule_work()
Date: Fri, 15 Mar 2013 17:39:16 +0100
Message-ID: <20130315163916.GA31995@redhat.com>
In-Reply-To: <20130314152819.7fb1242b493e8bad2d34671b@linux-foundation.org>
On 03/14, Andrew Morton wrote:
>
> On Wed, 13 Mar 2013 18:47:05 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
>
> > This means that orderly_poweroff() becomes async even if we do not
> > run the command and always succeeds, schedule_work() can only fail
> > if the work is already pending. We can export __orderly_poweroff()
> > and change the non-atomic callers which want the old semantics.
> >
> > ...
> >
> > @@ -2218,21 +2237,9 @@ static int __orderly_poweroff(void)
> >   */
> >  int orderly_poweroff(bool force)
> >  {
> > +	if (force) /* do not override the pending "true" */
> > +		poweroff_force = true;
> > +	schedule_work(&poweroff_work);
> > +	return 0;
> >  }
>
> afaict the current version of orderly_poweroff() will never return -
> either __orderly_poweroff() will block until the machine shuts down or
> kernel_power_off() will do so.
Note that __orderly_poweroff() uses UMH_WAIT_EXEC, not UMH_WAIT_PROC,
so it returns right after /sbin/poweroff starts to execute.
So it is already asynchronous unless execve() fails.
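
A userspace analogy may make the difference between the two wait modes
concrete. This is only a sketch and not how call_usermodehelper() is
implemented: spawn_helper() and its wait_for_exit flag are made-up names,
and a close-on-exec pipe stands in for the kernel's "exec finished"
notification. With wait_for_exit == false the caller learns only whether
exec succeeded (roughly UMH_WAIT_EXEC); with true it also waits for the
helper to exit (roughly UMH_WAIT_PROC).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for the two UMH wait modes.
 * Returns a negative errno if the helper could not be started,
 * 0 if it was started and we did not wait for it, or its exit
 * status if wait_for_exit is set.
 */
static int spawn_helper(char *const argv[], bool wait_for_exit)
{
	int pfd[2], err = 0, status = 0;
	ssize_t n;
	pid_t pid;

	if (pipe(pfd) < 0)
		return -errno;
	/* The write end closes on a successful exec, so read() sees EOF. */
	if (fcntl(pfd[1], F_SETFD, FD_CLOEXEC) < 0) {
		err = errno;
		close(pfd[0]);
		close(pfd[1]);
		return -err;
	}

	pid = fork();
	if (pid < 0) {
		err = errno;
		close(pfd[0]);
		close(pfd[1]);
		return -err;
	}
	if (pid == 0) {
		close(pfd[0]);
		execv(argv[0], argv);
		err = errno;	/* exec failed: tell the parent why */
		if (write(pfd[1], &err, sizeof(err)) != (ssize_t)sizeof(err))
			_exit(126);
		_exit(127);
	}

	close(pfd[1]);
	do {
		n = read(pfd[0], &err, sizeof(err));
	} while (n < 0 && errno == EINTR);
	close(pfd[0]);

	if (n == (ssize_t)sizeof(err)) {	/* exec failed */
		waitpid(pid, NULL, 0);
		return -err;
	}
	if (!wait_for_exit)
		return 0;	/* helper is running; a real caller reaps it later */

	if (waitpid(pid, &status, 0) < 0)
		return -errno;
	/* -EINTR here means the helper was killed by a signal. */
	return WIFEXITED(status) ? WEXITSTATUS(status) : -EINTR;
}
```

In this model, a helper that starts fine but exits with an error is
invisible to the non-waiting caller, which is the same trade-off as with
UMH_WAIT_EXEC.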
> However with this patch there is a path via which orderly_poweroff()
> can return to its caller, I think?
See above, but please also read the changelog.
With this patch orderly_poweroff() is always async, even if exec fails,
but
> If so, the caller might be rather
> surprised and we're exercising never-before-used code paths. In fact
> if the surprised caller goes oops, the poweroff might not occur at all.
This should not happen.
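
To illustrate what the caller sees with the patch, here is a
single-threaded userspace model. It ignores all concurrency; struct work,
model_schedule_work(), model_run_pending() and the counters are
hypothetical stand-ins, not the kernel's workqueue API. It shows only two
things: the call returns 0 right away, and a pending force == true is not
overridden by a later force == false.

```c
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for a work item. */
struct work {
	bool pending;
	void (*fn)(void);
};

/* Like schedule_work(): fails only if the work is already pending. */
static bool model_schedule_work(struct work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Stand-in for the worker thread picking the item up later. */
static void model_run_pending(struct work *w)
{
	if (w->pending) {
		w->pending = false;
		w->fn();
	}
}

static bool poweroff_force;
static int poweroff_runs;	/* how many times the work function ran */
static bool last_force;		/* the force value the work function saw */

static void poweroff_work_fn(void)
{
	poweroff_runs++;
	last_force = poweroff_force;
}

static struct work poweroff_work = { .pending = false, .fn = poweroff_work_fn };

static int model_orderly_poweroff(bool force)
{
	if (force)	/* do not override the pending "true" */
		poweroff_force = true;
	model_schedule_work(&poweroff_work);
	return 0;
}
```

Calling model_orderly_poweroff(true) and then model_orderly_poweroff(false)
before the work runs results in a single run that still sees force == true.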
Anyway. Please also note that we can now export __orderly_poweroff() and
probably change it to take another argument, "bool use_UMH_WAIT_PROC":

	int __orderly_poweroff(bool force, bool sync)
	{
		/* sync callers wait for /sbin/poweroff to exit, not just to exec */
		int wait = sync ? UMH_WAIT_PROC : UMH_WAIT_EXEC;
		int ret;

		/* argv/envp are set up as in the current code */
		ret = call_usermodehelper(argv[0], argv, envp, wait);
		if (force) {
			// EXEC failed or /sbin/poweroff didn't do its work
			if (ret || sync)
				kernel_power_off();
		}
		return ret;
	}

The non-atomic callers can use __orderly_poweroff(sync => true).
---------------------------------------------------------------------------
And, Andrew, et al... Could you help with the other problem mentioned
earlier? It is really trivial, but exactly because it is trivial I do not
know what I should do.

As a reminder: argv_split(poweroff_cmd), say, can race with a sysctl write
changing this string, and in that case it can write to the memory past the
end of the argv[] array. We can fix this, or we can rewrite argv_split/free:
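
	void argv_free(char **argv)
	{
		/*
		 * argv_split() returns a pointer one slot past the start of
		 * the array; slot -1 holds the duplicated string.
		 */
		argv--;
		kfree(argv[0]);
		kfree(argv);
	}
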
	char **argv_split(gfp_t gfp, const char *str, int *argcp)
	{
		char *argv_str;
		bool was_space;
		char **argv, **argv_ret;
		int argc;

		argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
		if (!argv_str)
			return NULL;

		argc = count_argc(argv_str);
		argv = kmalloc(sizeof(*argv) * (argc + 2), gfp);
		if (!argv) {
			kfree(argv_str);
			return NULL;
		}

		*argv = argv_str;
		argv_ret = ++argv;
		for (was_space = true; *argv_str; argv_str++) {
			if (isspace(*argv_str)) {
				was_space = true;
				*argv_str = 0;
			} else if (was_space) {
				was_space = false;
				*argv++ = argv_str;
			}
		}
		*argv = NULL;

		if (argcp)
			*argcp = argc;
		return argv_ret;
	}

This way it uses a single kstrndup() to keep the arguments and it is
always safe.
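
For anyone who wants to play with the idea outside the kernel, a userspace
sketch follows. It is an approximation under stated assumptions: malloc()
and free() replace kmalloc() and kfree(), and argv_split_copy(),
argv_free_copy() and count_words() are made-up names. The point it shows is
that both the count and the split run on one private copy, so the two
passes cannot disagree.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Count whitespace-separated words in a private, stable copy. */
static int count_words(const char *s)
{
	int argc = 0, was_space = 1;

	for (; *s; s++) {
		if (isspace((unsigned char)*s)) {
			was_space = 1;
		} else if (was_space) {
			was_space = 0;
			argc++;
		}
	}
	return argc;
}

/*
 * argv points one slot past the allocation; slot -1 owns the string
 * copy. free() must get the pointer malloc() returned, hence the
 * step back before freeing.
 */
static void argv_free_copy(char **argv)
{
	argv--;
	free(argv[0]);
	free(argv);
}

static char **argv_split_copy(const char *str, int *argcp)
{
	size_t len = strlen(str);
	char *copy, **argv, **argv_ret;
	int argc, was_space = 1;

	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	memcpy(copy, str, len);
	copy[len] = '\0';	/* always terminated, whatever str holds now */

	argc = count_words(copy);
	argv = malloc(sizeof(*argv) * (argc + 2));
	if (!argv) {
		free(copy);
		return NULL;
	}

	*argv = copy;		/* hidden slot: remember the copy */
	argv_ret = ++argv;
	for (; *copy; copy++) {
		if (isspace((unsigned char)*copy)) {
			was_space = 1;
			*copy = '\0';
		} else if (was_space) {
			was_space = 0;
			*argv++ = copy;
		}
	}
	*argv = NULL;

	if (argcp)
		*argcp = argc;
	return argv_ret;
}
```

A caller would split a string such as "/sbin/poweroff -f", use argv[0..argc-1],
and release everything with a single argv_free_copy(argv).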
But whatever we do with argv_split(), it can still read the string in the
middle of an update and see a mix of the old and new contents.
Personally I think we do not really care, but...

Perhaps we should add proc_dostring_lock(), which takes some lock, and
modify the callers of argv_split() (or add argv_split_lock())?

Or perhaps we should introduce an rwsem which protects every
sysctl string, and proc_dostring() should take this lock?

Help! I'd prefer to rewrite argv_split(), but I agree with any suggestion
in advance.

Oleg.