From: Zefan Li <lizefan@huawei.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: peterz@infradead.org, mingo@redhat.com, tj@kernel.org,
akpm@linux-foundation.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org, fernando_b1@lab.ntt.co.jp
Subject: Re: Racy manipulation of task_struct->flags in cgroups code causes hard to reproduce kernel panics
Date: Sat, 20 Sep 2014 13:55:54 +0800 [thread overview]
Message-ID: <541D16EA.70407@huawei.com> (raw)
In-Reply-To: <201409192053.IHJ35462.JLOMOSOFFVtQFH@I-love.SAKURA.ne.jp>
> Then, what made current->flags to unexpectedly preserve PF_USED_MATH flag?
> The user is running cgrulesengd process in order to utilize cpuset cgroup.
> Thus, cpuset_update_task_spread_flag() is called when cgrulesengd process
> writes someone's pid to /cgroup/cpuset/$group/tasks interface.
>
> cpuset_update_task_spread_flag() is updating other thread's
> "struct task_struct"->flags without exclusion control or atomic
> operations!
>
> ---------- linux-2.6.32-358.23.2.el6/kernel/cpuset.c ----------
> 300:/*
> 301: * update task's spread flag if cpuset's page/slab spread flag is set
> 302: *
> 303: * Called with callback_mutex/cgroup_mutex held
> 304: */
> 305:static void cpuset_update_task_spread_flag(struct cpuset *cs,
> 306: struct task_struct *tsk)
> 307:{
> 308: if (is_spread_page(cs))
> 309: tsk->flags |= PF_SPREAD_PAGE;
> 310: else
> 311: tsk->flags &= ~PF_SPREAD_PAGE;
> 312: if (is_spread_slab(cs))
> 313: tsk->flags |= PF_SPREAD_SLAB;
> 314: else
> 315: tsk->flags &= ~PF_SPREAD_SLAB;
> 316:}
We should make the updating of this flag atomic.
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 0d4e067..2f073db 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -94,12 +94,12 @@ extern int cpuset_slab_spread_node(void);
static inline int cpuset_do_page_mem_spread(void)
{
- return current->flags & PF_SPREAD_PAGE;
+ return task_spread_page(current);
}
static inline int cpuset_do_slab_mem_spread(void)
{
- return current->flags & PF_SPREAD_SLAB;
+ return task_spread_slab(current);
}
extern int current_cpuset_is_being_rebound(void);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..1e448a3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1903,8 +1903,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* randomize virtual address space */
#define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */
-#define PF_SPREAD_PAGE 0x01000000 /* Spread page cache over cpuset */
-#define PF_SPREAD_SLAB 0x02000000 /* Spread some slab caches over cpuset */
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */
#define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */
#define PF_MUTEX_TESTER 0x20000000 /* Thread belongs to the rt mutex tester */
@@ -1958,6 +1956,8 @@ static inline void memalloc_noio_restore(unsigned int flags)
/* Per-process atomic flags. */
#define PFA_NO_NEW_PRIVS 0x00000001 /* May not gain new privileges. */
+#define PFA_SPREAD_PAGE 0x00000002 /* Spread page cache over cpuset */
+#define PFA_SPREAD_SLAB 0x00000004 /* Spread some slab caches over cpuset */
static inline bool task_no_new_privs(struct task_struct *p)
{
@@ -1969,6 +1969,36 @@ static inline void task_set_no_new_privs(struct task_struct *p)
set_bit(PFA_NO_NEW_PRIVS, &p->atomic_flags);
}
+static inline bool task_spread_page(struct task_struct *p)
+{
+ return test_bit(PFA_SPREAD_PAGE, &p->atomic_flags);
+}
+
+static inline void task_set_spread_page(struct task_struct *p)
+{
+ set_bit(PFA_SPREAD_PAGE, &p->atomic_flags);
+}
+
+static inline void task_clear_spread_page(struct task_struct *p)
+{
+ clear_bit(PFA_SPREAD_PAGE, &p->atomic_flags);
+}
+
+static inline bool task_spread_slab(struct task_struct *p)
+{
+ return test_bit(PFA_SPREAD_SLAB, &p->atomic_flags);
+}
+
+static inline void task_set_spread_slab(struct task_struct *p)
+{
+ set_bit(PFA_SPREAD_SLAB, &p->atomic_flags);
+}
+
+static inline void task_clear_spread_slab(struct task_struct *p)
+{
+ clear_bit(PFA_SPREAD_SLAB, &p->atomic_flags);
+}
+
/*
* task->jobctl flags
*/
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index a37f4ed..1f107c7 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -365,13 +365,14 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs,
struct task_struct *tsk)
{
if (is_spread_page(cs))
- tsk->flags |= PF_SPREAD_PAGE;
+ task_set_spread_page(tsk);
else
- tsk->flags &= ~PF_SPREAD_PAGE;
+ task_clear_spread_page(tsk);
+
if (is_spread_slab(cs))
- tsk->flags |= PF_SPREAD_SLAB;
+ task_set_spread_slab(tsk);
else
- tsk->flags &= ~PF_SPREAD_SLAB;
+ task_clear_spread_slab(tsk);
}
/*
WARNING: multiple messages have this Message-ID (diff)
From: Zefan Li <lizefan@huawei.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: <peterz@infradead.org>, <mingo@redhat.com>, <tj@kernel.org>,
<akpm@linux-foundation.org>, <cgroups@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <fernando_b1@lab.ntt.co.jp>
Subject: Re: Racy manipulation of task_struct->flags in cgroups code causes hard to reproduce kernel panics
Date: Sat, 20 Sep 2014 13:55:54 +0800 [thread overview]
Message-ID: <541D16EA.70407@huawei.com> (raw)
In-Reply-To: <201409192053.IHJ35462.JLOMOSOFFVtQFH@I-love.SAKURA.ne.jp>
> Then, what made current->flags to unexpectedly preserve PF_USED_MATH flag?
> The user is running cgrulesengd process in order to utilize cpuset cgroup.
> Thus, cpuset_update_task_spread_flag() is called when cgrulesengd process
> writes someone's pid to /cgroup/cpuset/$group/tasks interface.
>
> cpuset_update_task_spread_flag() is updating other thread's
> "struct task_struct"->flags without exclusion control or atomic
> operations!
>
> ---------- linux-2.6.32-358.23.2.el6/kernel/cpuset.c ----------
> 300:/*
> 301: * update task's spread flag if cpuset's page/slab spread flag is set
> 302: *
> 303: * Called with callback_mutex/cgroup_mutex held
> 304: */
> 305:static void cpuset_update_task_spread_flag(struct cpuset *cs,
> 306: struct task_struct *tsk)
> 307:{
> 308: if (is_spread_page(cs))
> 309: tsk->flags |= PF_SPREAD_PAGE;
> 310: else
> 311: tsk->flags &= ~PF_SPREAD_PAGE;
> 312: if (is_spread_slab(cs))
> 313: tsk->flags |= PF_SPREAD_SLAB;
> 314: else
> 315: tsk->flags &= ~PF_SPREAD_SLAB;
> 316:}
We should make the updating of this flag atomic.
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 0d4e067..2f073db 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -94,12 +94,12 @@ extern int cpuset_slab_spread_node(void);
static inline int cpuset_do_page_mem_spread(void)
{
- return current->flags & PF_SPREAD_PAGE;
+ return task_spread_page(current);
}
static inline int cpuset_do_slab_mem_spread(void)
{
- return current->flags & PF_SPREAD_SLAB;
+ return task_spread_slab(current);
}
extern int current_cpuset_is_being_rebound(void);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..1e448a3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1903,8 +1903,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* randomize virtual address space */
#define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */
-#define PF_SPREAD_PAGE 0x01000000 /* Spread page cache over cpuset */
-#define PF_SPREAD_SLAB 0x02000000 /* Spread some slab caches over cpuset */
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */
#define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */
#define PF_MUTEX_TESTER 0x20000000 /* Thread belongs to the rt mutex tester */
@@ -1958,6 +1956,8 @@ static inline void memalloc_noio_restore(unsigned int flags)
/* Per-process atomic flags. */
#define PFA_NO_NEW_PRIVS 0x00000001 /* May not gain new privileges. */
+#define PFA_SPREAD_PAGE 0x00000002 /* Spread page cache over cpuset */
+#define PFA_SPREAD_SLAB 0x00000004 /* Spread some slab caches over cpuset */
static inline bool task_no_new_privs(struct task_struct *p)
{
@@ -1969,6 +1969,36 @@ static inline void task_set_no_new_privs(struct task_struct *p)
set_bit(PFA_NO_NEW_PRIVS, &p->atomic_flags);
}
+static inline bool task_spread_page(struct task_struct *p)
+{
+ return test_bit(PFA_SPREAD_PAGE, &p->atomic_flags);
+}
+
+static inline void task_set_spread_page(struct task_struct *p)
+{
+ set_bit(PFA_SPREAD_PAGE, &p->atomic_flags);
+}
+
+static inline void task_clear_spread_page(struct task_struct *p)
+{
+ clear_bit(PFA_SPREAD_PAGE, &p->atomic_flags);
+}
+
+static inline bool task_spread_slab(struct task_struct *p)
+{
+ return test_bit(PFA_SPREAD_SLAB, &p->atomic_flags);
+}
+
+static inline void task_set_spread_slab(struct task_struct *p)
+{
+ set_bit(PFA_SPREAD_SLAB, &p->atomic_flags);
+}
+
+static inline void task_clear_spread_slab(struct task_struct *p)
+{
+ clear_bit(PFA_SPREAD_SLAB, &p->atomic_flags);
+}
+
/*
* task->jobctl flags
*/
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index a37f4ed..1f107c7 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -365,13 +365,14 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs,
struct task_struct *tsk)
{
if (is_spread_page(cs))
- tsk->flags |= PF_SPREAD_PAGE;
+ task_set_spread_page(tsk);
else
- tsk->flags &= ~PF_SPREAD_PAGE;
+ task_clear_spread_page(tsk);
+
if (is_spread_slab(cs))
- tsk->flags |= PF_SPREAD_SLAB;
+ task_set_spread_slab(tsk);
else
- tsk->flags &= ~PF_SPREAD_SLAB;
+ task_clear_spread_slab(tsk);
}
/*
next prev parent reply other threads:[~2014-09-20 5:55 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-19 11:53 Racy manipulation of task_struct->flags in cgroups code causes hard to reproduce kernel panics Tetsuo Handa
2014-09-19 11:53 ` Tetsuo Handa
[not found] ` <201409192053.IHJ35462.JLOMOSOFFVtQFH-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-09-19 22:21 ` Peter Zijlstra
2014-09-19 22:21 ` Peter Zijlstra
2014-09-20 5:55 ` Zefan Li [this message]
2014-09-20 5:55 ` Zefan Li
2014-09-20 10:40 ` [PATCH 3.17-rc5] Fix confusing PFA_NO_NEW_PRIVS constant Tetsuo Handa
2014-09-20 10:40 ` Tetsuo Handa
2014-09-20 10:40 ` Tetsuo Handa
[not found] ` <201409201940.AHG21834.LJOFFHSFQOtVMO-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-09-20 17:19 ` Kees Cook
2014-09-20 17:19 ` Kees Cook
2014-09-20 17:19 ` Kees Cook
[not found] ` <541D16EA.70407-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2014-09-20 14:30 ` Racy manipulation of task_struct->flags in cgroups code causes hard to reproduce kernel panics Peter Zijlstra
2014-09-20 14:30 ` Peter Zijlstra
[not found] ` <20140920143012.GL2832-IIpfhp3q70wB9AHHLWeGtNQXobZC6xk2@public.gmane.org>
2014-09-20 17:15 ` Kees Cook
2014-09-20 17:15 ` Kees Cook
[not found] ` <CAGXu5j+P_kcgZuqYsemgL0KU_zRhz5HGJ6seh2oLyyst=cZP6g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-20 18:04 ` Peter Zijlstra
2014-09-20 18:04 ` Peter Zijlstra
2014-09-20 17:28 ` Tejun Heo
2014-09-20 17:28 ` Tejun Heo
[not found] ` <20140920172819.GD3681-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2014-09-21 5:15 ` Tetsuo Handa
2014-09-21 5:15 ` Tetsuo Handa
[not found] ` <201409211415.GJG26578.MFQOHtSFVJLOOF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-09-22 2:15 ` Zefan Li
2014-09-22 2:15 ` Zefan Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=541D16EA.70407@huawei.com \
--to=lizefan@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=fernando_b1@lab.ntt.co.jp \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=peterz@infradead.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.