* [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant
@ 2014-09-23 7:44 Zefan Li
2014-09-23 7:44 ` [PATCH v3 2/3] sched: add macros to define bitops for task atomic flags Zefan Li
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Zefan Li @ 2014-09-23 7:44 UTC (permalink / raw)
To: Tejun Heo
Cc: Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa,
LKML, Cgroups
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Commit 1d4457f99928 ("sched: move no_new_privs into new atomic flags")
defined PFA_NO_NEW_PRIVS as hexadecimal value, but it is confusing
because it is used as bit number. Redefine it as decimal bit number.
Note this changes the bit position of PFA_NOW_NEW_PRIVS from 1 to 0.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Kees Cook <keescook@chromium.org>
[ lizf: slightly modified subject and changelog ]
Signed-off-by: Zefan Li <lizefan@huawei.com>
---
include/linux/sched.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..4557765 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1957,7 +1957,7 @@ static inline void memalloc_noio_restore(unsigned int flags)
}
/* Per-process atomic flags. */
-#define PFA_NO_NEW_PRIVS 0x00000001 /* May not gain new privileges. */
+#define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */
static inline bool task_no_new_privs(struct task_struct *p)
{
--
1.8.0.2
^ permalink raw reply related [flat|nested] 12+ messages in thread* [PATCH v3 2/3] sched: add macros to define bitops for task atomic flags 2014-09-23 7:44 [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant Zefan Li @ 2014-09-23 7:44 ` Zefan Li [not found] ` <542124FB.4030109-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> 2014-09-23 7:45 ` [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be " Zefan Li [not found] ` <542124D3.9000007-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> 2 siblings, 1 reply; 12+ messages in thread From: Zefan Li @ 2014-09-23 7:44 UTC (permalink / raw) To: Tejun Heo Cc: Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups This will simplify code when we add new flags. v3: - Kees pointed out that no_new_privs should never be cleared, so we shouldn't define task_clear_no_new_privs(). we define 3 macros instead of a single one. v2: - updated scripts/tags.sh, suggested by Peter. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Zefan Li <lizefan@huawei.com> --- include/linux/sched.h | 21 ++++++++++++--------- scripts/tags.sh | 6 ++++++ 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 4557765..5630763 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1959,15 +1959,18 @@ static inline void memalloc_noio_restore(unsigned int flags) /* Per-process atomic flags. */ #define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */ -static inline bool task_no_new_privs(struct task_struct *p) -{ - return test_bit(PFA_NO_NEW_PRIVS, &p->atomic_flags); -} - -static inline void task_set_no_new_privs(struct task_struct *p) -{ - set_bit(PFA_NO_NEW_PRIVS, &p->atomic_flags); -} +#define TASK_PFA_TEST(name, func) \ + static inline bool task_##func(struct task_struct *p) \ + { return test_bit(PFA_##name, &p->atomic_flags); } +#define TASK_PFA_SET(name, func) \ + static inline void task_set_##func(struct task_struct *p) \ + { set_bit(PFA_##name, &p->atomic_flags); } +#define TASK_PFA_CLEAR(name, func) \ + static inline void task_clear_##func(struct task_struct *p) \ + { clear_bit(PFA_##name, &p->atomic_flags); } + +TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs) +TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs) /* * task->jobctl flags diff --git a/scripts/tags.sh b/scripts/tags.sh index cbfd269..293828b 100755 --- a/scripts/tags.sh +++ b/scripts/tags.sh @@ -197,6 +197,9 @@ exuberant() --regex-c++='/SETPCGFLAG\(([^,)]*).*/SetPageCgroup\1/' \ --regex-c++='/CLEARPCGFLAG\(([^,)]*).*/ClearPageCgroup\1/' \ --regex-c++='/TESTCLEARPCGFLAG\(([^,)]*).*/TestClearPageCgroup\1/' \ + --regex-c++='/TASK_PFA_TEST\([^,]*,\s*([^)]*)\)/task_\1/' \ + --regex-c++='/TASK_PFA_SET\([^,]*,\s*([^)]*)\)/task_set_\1/' \ + --regex-c++='/TASK_PFA_CLEAR\([^,]*,\s*([^)]*)\)/task_clear_\1/'\ --regex-c='/PCI_OP_READ\((\w*).*[1-4]\)/pci_bus_read_config_\1/' \ --regex-c='/PCI_OP_WRITE\((\w*).*[1-4]\)/pci_bus_write_config_\1/' \ --regex-c='/DEFINE_(MUTEX|SEMAPHORE|SPINLOCK)\((\w*)/\2/v/' \ @@ -260,6 +263,9 @@ emacs() --regex='/SETPCGFLAG\(([^,)]*).*/SetPageCgroup\1/' \ --regex='/CLEARPCGFLAG\(([^,)]*).*/ClearPageCgroup\1/' \ --regex='/TESTCLEARPCGFLAG\(([^,)]*).*/TestClearPageCgroup\1/' \ + --regex='/TASK_PFA_TEST\([^,]*,\s*([^)]*)\)/task_\1/' \ + --regex='/TASK_PFA_SET\([^,]*,\s*([^)]*)\)/task_set_\1/' \ + --regex='/TASK_PFA_CLEAR\([^,]*,\s*([^)]*)\)/task_clear_\1/' \ --regex='/_PE(\([^,)]*\).*/PEVENT_ERRNO__\1/' \ --regex='/PCI_OP_READ(\([a-z]*[a-z]\).*[1-4])/pci_bus_read_config_\1/' \ --regex='/PCI_OP_WRITE(\([a-z]*[a-z]\).*[1-4])/pci_bus_write_config_\1/'\ -- 1.8.0.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
[parent not found: <542124FB.4030109-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v3 2/3] sched: add macros to define bitops for task atomic flags [not found] ` <542124FB.4030109-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> @ 2014-09-23 10:01 ` Peter Zijlstra [not found] ` <20140923100127.GE3312-IIpfhp3q70z/8w/KjCw3T+5/BudmfyzbbVWyRVo5IupeoWH0uzbU5w@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Peter Zijlstra @ 2014-09-23 10:01 UTC (permalink / raw) To: Zefan Li Cc: Tejun Heo, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups On Tue, Sep 23, 2014 at 03:44:59PM +0800, Zefan Li wrote: > This will simplify code when we add new flags. > > v3: > - Kees pointed out that no_new_privs should never be cleared, so we > shouldn't define task_clear_no_new_privs(). we define 3 macros instead > of a single one. > > v2: > - updated scripts/tags.sh, suggested by Peter. > > Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> > Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> > Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> > Cc: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> > Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> Acked-by: Peter Zijlstra (Intel) <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20140923100127.GE3312-IIpfhp3q70z/8w/KjCw3T+5/BudmfyzbbVWyRVo5IupeoWH0uzbU5w@public.gmane.org>]
* Re: [PATCH v3 2/3] sched: add macros to define bitops for task atomic flags [not found] ` <20140923100127.GE3312-IIpfhp3q70z/8w/KjCw3T+5/BudmfyzbbVWyRVo5IupeoWH0uzbU5w@public.gmane.org> @ 2014-09-23 16:58 ` Kees Cook 0 siblings, 0 replies; 12+ messages in thread From: Kees Cook @ 2014-09-23 16:58 UTC (permalink / raw) To: Peter Zijlstra Cc: Zefan Li, Tejun Heo, Ingo Molnar, Miao Xie, Tetsuo Handa, LKML, Cgroups On Tue, Sep 23, 2014 at 3:01 AM, Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote: > On Tue, Sep 23, 2014 at 03:44:59PM +0800, Zefan Li wrote: >> This will simplify code when we add new flags. >> >> v3: >> - Kees pointed out that no_new_privs should never be cleared, so we >> shouldn't define task_clear_no_new_privs(). we define 3 macros instead >> of a single one. Thanks! This looks good to me. >> >> v2: >> - updated scripts/tags.sh, suggested by Peter. >> >> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> >> Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> >> Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> >> Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >> Cc: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> >> Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> > > Acked-by: Peter Zijlstra (Intel) <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> -Kees -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags 2014-09-23 7:44 [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant Zefan Li 2014-09-23 7:44 ` [PATCH v3 2/3] sched: add macros to define bitops for task atomic flags Zefan Li @ 2014-09-23 7:45 ` Zefan Li 2014-09-23 22:10 ` David Rientjes [not found] ` <542124D3.9000007-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> 2 siblings, 1 reply; 12+ messages in thread From: Zefan Li @ 2014-09-23 7:45 UTC (permalink / raw) To: Tejun Heo Cc: Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups When we change cpuset.memory_spread_{page,slab}, cpuset will flip PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset. This should be done using atomic bitops, but currently we don't, which is broken. Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happend when one thread tried to clear PF_USED_MATH while at the same time another thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on the same task. Here's the full report: https://lkml.org/lkml/2014/9/19/230 To fix this, we make PF_SPREAD_PAGE and PF_SPARED_SLAB atomic flags. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Kees Cook <keescook@chromium.org> Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time") Cc: <stable@vger.kernel.org> # 2.6.31+ Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Zefan Li <lizefan@huawei.com> --- include/linux/cpuset.h | 4 ++-- include/linux/sched.h | 13 +++++++++++-- kernel/cpuset.c | 9 +++++---- 3 files changed, 18 insertions(+), 8 deletions(-) diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h index 0d4e067..2f073db 100644 --- a/include/linux/cpuset.h +++ b/include/linux/cpuset.h @@ -94,12 +94,12 @@ extern int cpuset_slab_spread_node(void); static inline int cpuset_do_page_mem_spread(void) { - return current->flags & PF_SPREAD_PAGE; + return task_spread_page(current); } static inline int cpuset_do_slab_mem_spread(void) { - return current->flags & PF_SPREAD_SLAB; + return task_spread_slab(current); } extern int current_cpuset_is_being_rebound(void); diff --git a/include/linux/sched.h b/include/linux/sched.h index 5630763..7b1cafe 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1903,8 +1903,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, #define PF_KTHREAD 0x00200000 /* I am a kernel thread */ #define PF_RANDOMIZE 0x00400000 /* randomize virtual address space */ #define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */ -#define PF_SPREAD_PAGE 0x01000000 /* Spread page cache over cpuset */ -#define PF_SPREAD_SLAB 0x02000000 /* Spread some slab caches over cpuset */ #define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */ #define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */ #define PF_MUTEX_TESTER 0x20000000 /* Thread belongs to the rt mutex tester */ @@ -1958,6 +1956,9 @@ static inline void memalloc_noio_restore(unsigned int flags) /* Per-process atomic flags. */ #define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */ +#define PFA_SPREAD_PAGE 1 /* Spread page cache over cpuset */ +#define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */ + #define TASK_PFA_TEST(name, func) \ static inline bool task_##func(struct task_struct *p) \ @@ -1972,6 +1973,14 @@ static inline void memalloc_noio_restore(unsigned int flags) TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs) TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs) +TASK_PFA_TEST(SPREAD_PAGE, spread_page) +TASK_PFA_SET(SPREAD_PAGE, spread_page) +TASK_PFA_CLEAR(SPREAD_PAGE, spread_page) + +TASK_PFA_TEST(SPREAD_SLAB, spread_slab) +TASK_PFA_SET(SPREAD_SLAB, spread_slab) +TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab) + /* * task->jobctl flags */ diff --git a/kernel/cpuset.c b/kernel/cpuset.c index a37f4ed..1f107c7 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -365,13 +365,14 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs, struct task_struct *tsk) { if (is_spread_page(cs)) - tsk->flags |= PF_SPREAD_PAGE; + task_set_spread_page(tsk); else - tsk->flags &= ~PF_SPREAD_PAGE; + task_clear_spread_page(tsk); + if (is_spread_slab(cs)) - tsk->flags |= PF_SPREAD_SLAB; + task_set_spread_slab(tsk); else - tsk->flags &= ~PF_SPREAD_SLAB; + task_clear_spread_slab(tsk); } /* -- 1.8.0.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags 2014-09-23 7:45 ` [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be " Zefan Li @ 2014-09-23 22:10 ` David Rientjes [not found] ` <alpine.DEB.2.02.1409231508590.22630-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: David Rientjes @ 2014-09-23 22:10 UTC (permalink / raw) To: Zefan Li Cc: Tejun Heo, Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups On Tue, 23 Sep 2014, Zefan Li wrote: > When we change cpuset.memory_spread_{page,slab}, cpuset will flip > PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset. > This should be done using atomic bitops, but currently we don't, > which is broken. > > Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happend > when one thread tried to clear PF_USED_MATH while at the same time another > thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on > the same task. > > Here's the full report: > https://lkml.org/lkml/2014/9/19/230 > > To fix this, we make PF_SPREAD_PAGE and PF_SPARED_SLAB atomic flags. > s/SPARED/SPREAD/ > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: Miao Xie <miaox@cn.fujitsu.com> > Cc: Kees Cook <keescook@chromium.org> > Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time") > Cc: <stable@vger.kernel.org> # 2.6.31+ > Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Signed-off-by: Zefan Li <lizefan@huawei.com> > --- > include/linux/cpuset.h | 4 ++-- > include/linux/sched.h | 13 +++++++++++-- > kernel/cpuset.c | 9 +++++---- > 3 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h > index 0d4e067..2f073db 100644 > --- a/include/linux/cpuset.h > +++ b/include/linux/cpuset.h > @@ -94,12 +94,12 @@ extern int cpuset_slab_spread_node(void); > > static inline int cpuset_do_page_mem_spread(void) > { > - return current->flags & PF_SPREAD_PAGE; > + return task_spread_page(current); > } > > static inline int cpuset_do_slab_mem_spread(void) > { > - return current->flags & PF_SPREAD_SLAB; > + return task_spread_slab(current); > } > > extern int current_cpuset_is_being_rebound(void); > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 5630763..7b1cafe 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1903,8 +1903,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, > #define PF_KTHREAD 0x00200000 /* I am a kernel thread */ > #define PF_RANDOMIZE 0x00400000 /* randomize virtual address space */ > #define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */ > -#define PF_SPREAD_PAGE 0x01000000 /* Spread page cache over cpuset */ > -#define PF_SPREAD_SLAB 0x02000000 /* Spread some slab caches over cpuset */ > #define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */ > #define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */ > #define PF_MUTEX_TESTER 0x20000000 /* Thread belongs to the rt mutex tester */ > @@ -1958,6 +1956,9 @@ static inline void memalloc_noio_restore(unsigned int flags) > > /* Per-process atomic flags. */ > #define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */ > +#define PFA_SPREAD_PAGE 1 /* Spread page cache over cpuset */ > +#define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */ > + > > #define TASK_PFA_TEST(name, func) \ > static inline bool task_##func(struct task_struct *p) \ > @@ -1972,6 +1973,14 @@ static inline void memalloc_noio_restore(unsigned int flags) > TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs) > TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs) > > +TASK_PFA_TEST(SPREAD_PAGE, spread_page) > +TASK_PFA_SET(SPREAD_PAGE, spread_page) > +TASK_PFA_CLEAR(SPREAD_PAGE, spread_page) > + > +TASK_PFA_TEST(SPREAD_SLAB, spread_slab) > +TASK_PFA_SET(SPREAD_SLAB, spread_slab) > +TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab) > + > /* > * task->jobctl flags > */ > diff --git a/kernel/cpuset.c b/kernel/cpuset.c > index a37f4ed..1f107c7 100644 > --- a/kernel/cpuset.c > +++ b/kernel/cpuset.c > @@ -365,13 +365,14 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs, > struct task_struct *tsk) > { > if (is_spread_page(cs)) > - tsk->flags |= PF_SPREAD_PAGE; > + task_set_spread_page(tsk); > else > - tsk->flags &= ~PF_SPREAD_PAGE; > + task_clear_spread_page(tsk); > + > if (is_spread_slab(cs)) > - tsk->flags |= PF_SPREAD_SLAB; > + task_set_spread_slab(tsk); > else > - tsk->flags &= ~PF_SPREAD_SLAB; > + task_clear_spread_slab(tsk); > } > > /* This most certainly needs commentary to specify why these have to be atomic ops. ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <alpine.DEB.2.02.1409231508590.22630-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>]
* Re: [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags [not found] ` <alpine.DEB.2.02.1409231508590.22630-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org> @ 2014-09-24 3:15 ` Zefan Li 0 siblings, 0 replies; 12+ messages in thread From: Zefan Li @ 2014-09-24 3:15 UTC (permalink / raw) To: David Rientjes Cc: Tejun Heo, Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups 于 2014/9/24 6:10, David Rientjes wrote: > On Tue, 23 Sep 2014, Zefan Li wrote: > >> When we change cpuset.memory_spread_{page,slab}, cpuset will flip >> PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset. >> This should be done using atomic bitops, but currently we don't, >> which is broken. >> >> Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happend >> when one thread tried to clear PF_USED_MATH while at the same time another >> thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on >> the same task. >> >> Here's the full report: >> https://lkml.org/lkml/2014/9/19/230 >> >> To fix this, we make PF_SPREAD_PAGE and PF_SPARED_SLAB atomic flags. >> > > s/SPARED/SPREAD/ > >> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> >> Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> >> Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> >> Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >> Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time") >> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 2.6.31+ >> Reported-by: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> >> Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> >> --- >> include/linux/cpuset.h | 4 ++-- >> include/linux/sched.h | 13 +++++++++++-- >> kernel/cpuset.c | 9 +++++---- >> 3 files changed, 18 insertions(+), 8 deletions(-) >> >> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h >> index 0d4e067..2f073db 100644 >> --- a/include/linux/cpuset.h >> +++ b/include/linux/cpuset.h >> @@ -94,12 +94,12 @@ extern int cpuset_slab_spread_node(void); >> >> static inline int cpuset_do_page_mem_spread(void) >> { >> - return current->flags & PF_SPREAD_PAGE; >> + return task_spread_page(current); >> } >> >> static inline int cpuset_do_slab_mem_spread(void) >> { >> - return current->flags & PF_SPREAD_SLAB; >> + return task_spread_slab(current); >> } >> >> extern int current_cpuset_is_being_rebound(void); >> diff --git a/include/linux/sched.h b/include/linux/sched.h >> index 5630763..7b1cafe 100644 >> --- a/include/linux/sched.h >> +++ b/include/linux/sched.h >> @@ -1903,8 +1903,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, >> #define PF_KTHREAD 0x00200000 /* I am a kernel thread */ >> #define PF_RANDOMIZE 0x00400000 /* randomize virtual address space */ >> #define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */ >> -#define PF_SPREAD_PAGE 0x01000000 /* Spread page cache over cpuset */ >> -#define PF_SPREAD_SLAB 0x02000000 /* Spread some slab caches over cpuset */ >> #define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */ >> #define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */ >> #define PF_MUTEX_TESTER 0x20000000 /* Thread belongs to the rt mutex tester */ >> @@ -1958,6 +1956,9 @@ static inline void memalloc_noio_restore(unsigned int flags) >> >> /* Per-process atomic flags. */ >> #define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */ >> +#define PFA_SPREAD_PAGE 1 /* Spread page cache over cpuset */ >> +#define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */ >> + >> >> #define TASK_PFA_TEST(name, func) \ >> static inline bool task_##func(struct task_struct *p) \ >> @@ -1972,6 +1973,14 @@ static inline void memalloc_noio_restore(unsigned int flags) >> TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs) >> TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs) >> >> +TASK_PFA_TEST(SPREAD_PAGE, spread_page) >> +TASK_PFA_SET(SPREAD_PAGE, spread_page) >> +TASK_PFA_CLEAR(SPREAD_PAGE, spread_page) >> + >> +TASK_PFA_TEST(SPREAD_SLAB, spread_slab) >> +TASK_PFA_SET(SPREAD_SLAB, spread_slab) >> +TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab) >> + >> /* >> * task->jobctl flags >> */ >> diff --git a/kernel/cpuset.c b/kernel/cpuset.c >> index a37f4ed..1f107c7 100644 >> --- a/kernel/cpuset.c >> +++ b/kernel/cpuset.c >> @@ -365,13 +365,14 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs, >> struct task_struct *tsk) >> { >> if (is_spread_page(cs)) >> - tsk->flags |= PF_SPREAD_PAGE; >> + task_set_spread_page(tsk); >> else >> - tsk->flags &= ~PF_SPREAD_PAGE; >> + task_clear_spread_page(tsk); >> + >> if (is_spread_slab(cs)) >> - tsk->flags |= PF_SPREAD_SLAB; >> + task_set_spread_slab(tsk); >> else >> - tsk->flags &= ~PF_SPREAD_SLAB; >> + task_clear_spread_slab(tsk); >> } >> >> /* > > This most certainly needs commentary to specify why these have to be > atomic ops. > . It won't hurt to add more comment, but I don't think it's necessary, because the reason to use atomic bitops seems obvious to me. Besides there's no such comment for no_new_privs, which was tsk->atomic_flags introduced for. ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <542124D3.9000007-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant [not found] ` <542124D3.9000007-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> @ 2014-09-24 13:22 ` Tejun Heo [not found] ` <20140924132218.GC16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Tejun Heo @ 2014-09-24 13:22 UTC (permalink / raw) To: Zefan Li Cc: Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups, David Rientjes On Tue, Sep 23, 2014 at 03:44:19PM +0800, Zefan Li wrote: > From: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> > > Commit 1d4457f99928 ("sched: move no_new_privs into new atomic flags") > defined PFA_NO_NEW_PRIVS as hexadecimal value, but it is confusing > because it is used as bit number. Redefine it as decimal bit number. > > Note this changes the bit position of PFA_NOW_NEW_PRIVS from 1 to 0. > > Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> > Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> > Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> > Signed-off-by: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> > Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> > [ lizf: slightly modified subject and changelog ] > Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> Applied 1-3 to cgroup/for-3.17-fixes w/ the typo spotted by davidr fixed. Thanks. -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20140924132218.GC16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* Re: [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant [not found] ` <20140924132218.GC16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2014-09-24 13:35 ` Tejun Heo 2014-09-24 14:09 ` Tetsuo Handa [not found] ` <20140924133513.GD16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 0 siblings, 2 replies; 12+ messages in thread From: Tejun Heo @ 2014-09-24 13:35 UTC (permalink / raw) To: Zefan Li Cc: Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups, David Rientjes On Wed, Sep 24, 2014 at 09:22:18AM -0400, Tejun Heo wrote: > On Tue, Sep 23, 2014 at 03:44:19PM +0800, Zefan Li wrote: > > From: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> > > > > Commit 1d4457f99928 ("sched: move no_new_privs into new atomic flags") > > defined PFA_NO_NEW_PRIVS as hexadecimal value, but it is confusing > > because it is used as bit number. Redefine it as decimal bit number. > > > > Note this changes the bit position of PFA_NOW_NEW_PRIVS from 1 to 0. > > > > Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> > > Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > > Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> > > Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> > > Signed-off-by: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> > > Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> > > [ lizf: slightly modified subject and changelog ] > > Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> > > Applied 1-3 to cgroup/for-3.17-fixes w/ the typo spotted by davidr > fixed. Reverted due to build failure. Will wait for a refreshed version. :( Thanks. -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant 2014-09-24 13:35 ` Tejun Heo @ 2014-09-24 14:09 ` Tetsuo Handa [not found] ` <201409242309.HDF95877.OSVFFtQLJMOHOF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> [not found] ` <20140924133513.GD16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 1 sibling, 1 reply; 12+ messages in thread From: Tetsuo Handa @ 2014-09-24 14:09 UTC (permalink / raw) To: tj; +Cc: lizefan, peterz, mingo, keescook, miaox, linux-kernel, cgroups, rientjes Tejun Heo wrote: > > Applied 1-3 to cgroup/for-3.17-fixes w/ the typo spotted by davidr > > fixed. s/happend/happened/ > > Reverted due to build failure. Will wait for a refreshed version. :( I can build with 1-3 patches using HEAD of linux.git. What failure did you see? $ make kernel/seccomp.o kernel/sys.o security/apparmor/domain.o kernel/cpuset.o ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <201409242309.HDF95877.OSVFFtQLJMOHOF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>]
* Re: [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant [not found] ` <201409242309.HDF95877.OSVFFtQLJMOHOF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> @ 2014-09-24 14:32 ` Tejun Heo 0 siblings, 0 replies; 12+ messages in thread From: Tejun Heo @ 2014-09-24 14:32 UTC (permalink / raw) To: Tetsuo Handa Cc: lizefan-hv44wF8Li93QT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-DgEjT+Ai2ygdnm+yROfE0A, keescook-F7+t8E8rja9g9hUCZPvPmw, miaox-BthXqXjhjHXQFUHtdCDX3A, linux-kernel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, rientjes-hpIqsD4AKlfQT0dZR+AlfA On Wed, Sep 24, 2014 at 11:09:25PM +0900, Tetsuo Handa wrote: > Tejun Heo wrote: > > > Applied 1-3 to cgroup/for-3.17-fixes w/ the typo spotted by davidr > > > fixed. > > s/happend/happened/ > > > > > Reverted due to build failure. Will wait for a refreshed version. :( > > I can build with 1-3 patches using HEAD of linux.git. What failure did you see? > > $ make kernel/seccomp.o kernel/sys.o security/apparmor/domain.o kernel/cpuset.o Got the following failure report from the build bot. tree: git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-3.17-fixes head: ed91a06989e253cde811d492414adcf94f34eb03 commit: ed91a06989e253cde811d492414adcf94f34eb03 [8/8] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB +should be atomic flags config: x86_64-randconfig-hxb2-0924 (attached as .config) reproduce: git checkout ed91a06989e253cde811d492414adcf94f34eb03 # save the attached .config to linux build tree make ARCH=x86_64 All error/warnings: In file included from include/linux/linkage.h:4:0, from include/linux/preempt.h:9, from include/linux/spinlock.h:50, from include/linux/mmzone.h:7, from include/linux/gfp.h:5, from include/linux/slab.h:14, from mm/slab.c:89: mm/slab.c: In function '__do_cache_alloc': >> mm/slab.c:3229:54: error: 'PF_SPREAD_SLAB' undeclared (first use in this function) if (current->mempolicy || unlikely(current->flags & PF_SPREAD_SLAB)) { ^ include/linux/compiler.h:131:45: note: in definition of macro 'unlikely' # define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0)) ^ mm/slab.c:3229:54: note: each undeclared identifier is reported only once for each function it ap +pears in if (current->mempolicy || unlikely(current->flags & PF_SPREAD_SLAB)) { ^ include/linux/compiler.h:131:45: note: in definition of macro 'unlikely' # define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0)) ^ vim +/PF_SPREAD_SLAB +3229 mm/slab.c 8c8cc2c1 Pekka Enberg 2007-02-10 3223 8c8cc2c1 Pekka Enberg 2007-02-10 3224 static __always_inline void * 8c8cc2c1 Pekka Enberg 2007-02-10 3225 __do_cache_alloc(struct kmem_cache *cache, gfp_t flags) 8c8cc2c1 Pekka Enberg 2007-02-10 3226 { 8c8cc2c1 Pekka Enberg 2007-02-10 3227 void *objp; 8c8cc2c1 Pekka Enberg 2007-02-10 3228 f0432d15 David Rientjes 2014-04-07 @3229 if (current->mempolicy || unlikely(current->flags & +PF_SPREAD_SLAB)) { 8c8cc2c1 Pekka Enberg 2007-02-10 3230 objp = alternate_node_alloc(cache, flags); 8c8cc2c1 Pekka Enberg 2007-02-10 3231 if (objp) 8c8cc2c1 Pekka Enberg 2007-02-10 3232 goto out; -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20140924133513.GD16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* Re: [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant [not found] ` <20140924133513.GD16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2014-09-25 1:42 ` Zefan Li 0 siblings, 0 replies; 12+ messages in thread From: Zefan Li @ 2014-09-25 1:42 UTC (permalink / raw) To: Tejun Heo Cc: Peter Zijlstra, Ingo Molnar, Kees Cook, Miao Xie, Tetsuo Handa, LKML, Cgroups, David Rientjes On 2014/9/24 21:35, Tejun Heo wrote: > On Wed, Sep 24, 2014 at 09:22:18AM -0400, Tejun Heo wrote: >> On Tue, Sep 23, 2014 at 03:44:19PM +0800, Zefan Li wrote: >>> From: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> >>> >>> Commit 1d4457f99928 ("sched: move no_new_privs into new atomic flags") >>> defined PFA_NO_NEW_PRIVS as hexadecimal value, but it is confusing >>> because it is used as bit number. Redefine it as decimal bit number. >>> >>> Note this changes the bit position of PFA_NOW_NEW_PRIVS from 1 to 0. >>> >>> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> >>> Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> >>> Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> >>> Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >>> Signed-off-by: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> >>> Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >>> [ lizf: slightly modified subject and changelog ] >>> Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> >> >> Applied 1-3 to cgroup/for-3.17-fixes w/ the typo spotted by davidr >> fixed. > > Reverted due to build failure. Will wait for a refreshed version. :( > Didn't notice mm/slab.c uses PF_SPREAD_SLAB without the cpuset wrapper... ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-09-25 1:42 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-23 7:44 [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant Zefan Li
2014-09-23 7:44 ` [PATCH v3 2/3] sched: add macros to define bitops for task atomic flags Zefan Li
[not found] ` <542124FB.4030109-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2014-09-23 10:01 ` Peter Zijlstra
[not found] ` <20140923100127.GE3312-IIpfhp3q70z/8w/KjCw3T+5/BudmfyzbbVWyRVo5IupeoWH0uzbU5w@public.gmane.org>
2014-09-23 16:58 ` Kees Cook
2014-09-23 7:45 ` [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be " Zefan Li
2014-09-23 22:10 ` David Rientjes
[not found] ` <alpine.DEB.2.02.1409231508590.22630-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2014-09-24 3:15 ` Zefan Li
[not found] ` <542124D3.9000007-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2014-09-24 13:22 ` [PATCH v3 1/3] sched: fix confusing PFA_NO_NEW_PRIVS constant Tejun Heo
[not found] ` <20140924132218.GC16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2014-09-24 13:35 ` Tejun Heo
2014-09-24 14:09 ` Tetsuo Handa
[not found] ` <201409242309.HDF95877.OSVFFtQLJMOHOF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-09-24 14:32 ` Tejun Heo
[not found] ` <20140924133513.GD16555-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2014-09-25 1:42 ` Zefan Li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox