From: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
To: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>,
	Tetsuo Handa
	<penguin-kernel-1yMVhJb1mP/7nzcFbJAaVXf5DAMn2ifp@public.gmane.org>,
	LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Cgroups <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH v3 3/3] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags
Date: Wed, 24 Sep 2014 11:15:36 +0800	[thread overview]
Message-ID: <54223758.4070300@huawei.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1409231508590.22630-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>

On 2014/9/24 6:10, David Rientjes wrote:
> On Tue, 23 Sep 2014, Zefan Li wrote:
> 
>> When we change cpuset.memory_spread_{page,slab}, cpuset will flip
>> PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset.
>> This should be done using atomic bitops, but currently we don't,
>> which is broken.
>>
>> Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
>> when one thread tried to clear PF_USED_MATH while at the same time another
>> thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
>> the same task.
>>
>> Here's the full report:
>> https://lkml.org/lkml/2014/9/19/230
>>
>> To fix this, we make PF_SPREAD_PAGE and PF_SPARED_SLAB atomic flags.
>>
> 
> s/SPARED/SPREAD/
> 
>> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
>> Cc: Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>> Cc: Miao Xie <miaox-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
>> Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>> Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time")
>> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # 2.6.31+
>> Reported-by: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
>> Signed-off-by: Zefan Li <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>> ---
>>  include/linux/cpuset.h |  4 ++--
>>  include/linux/sched.h  | 13 +++++++++++--
>>  kernel/cpuset.c        |  9 +++++----
>>  3 files changed, 18 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
>> index 0d4e067..2f073db 100644
>> --- a/include/linux/cpuset.h
>> +++ b/include/linux/cpuset.h
>> @@ -94,12 +94,12 @@ extern int cpuset_slab_spread_node(void);
>>  
>>  static inline int cpuset_do_page_mem_spread(void)
>>  {
>> -	return current->flags & PF_SPREAD_PAGE;
>> +	return task_spread_page(current);
>>  }
>>  
>>  static inline int cpuset_do_slab_mem_spread(void)
>>  {
>> -	return current->flags & PF_SPREAD_SLAB;
>> +	return task_spread_slab(current);
>>  }
>>  
>>  extern int current_cpuset_is_being_rebound(void);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 5630763..7b1cafe 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1903,8 +1903,6 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
>>  #define PF_KTHREAD	0x00200000	/* I am a kernel thread */
>>  #define PF_RANDOMIZE	0x00400000	/* randomize virtual address space */
>>  #define PF_SWAPWRITE	0x00800000	/* Allowed to write to swap */
>> -#define PF_SPREAD_PAGE	0x01000000	/* Spread page cache over cpuset */
>> -#define PF_SPREAD_SLAB	0x02000000	/* Spread some slab caches over cpuset */
>>  #define PF_NO_SETAFFINITY 0x04000000	/* Userland is not allowed to meddle with cpus_allowed */
>>  #define PF_MCE_EARLY    0x08000000      /* Early kill for mce process policy */
>>  #define PF_MUTEX_TESTER	0x20000000	/* Thread belongs to the rt mutex tester */
>> @@ -1958,6 +1956,9 @@ static inline void memalloc_noio_restore(unsigned int flags)
>>  
>>  /* Per-process atomic flags. */
>>  #define PFA_NO_NEW_PRIVS 0	/* May not gain new privileges. */
>> +#define PFA_SPREAD_PAGE  1      /* Spread page cache over cpuset */
>> +#define PFA_SPREAD_SLAB  2      /* Spread some slab caches over cpuset */
>> +
>>  
>>  #define TASK_PFA_TEST(name, func)					\
>>  	static inline bool task_##func(struct task_struct *p)		\
>> @@ -1972,6 +1973,14 @@ static inline void memalloc_noio_restore(unsigned int flags)
>>  TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs)
>>  TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs)
>>  
>> +TASK_PFA_TEST(SPREAD_PAGE, spread_page)
>> +TASK_PFA_SET(SPREAD_PAGE, spread_page)
>> +TASK_PFA_CLEAR(SPREAD_PAGE, spread_page)
>> +
>> +TASK_PFA_TEST(SPREAD_SLAB, spread_slab)
>> +TASK_PFA_SET(SPREAD_SLAB, spread_slab)
>> +TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab)
>> +
>>  /*
>>   * task->jobctl flags
>>   */
>> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
>> index a37f4ed..1f107c7 100644
>> --- a/kernel/cpuset.c
>> +++ b/kernel/cpuset.c
>> @@ -365,13 +365,14 @@ static void cpuset_update_task_spread_flag(struct cpuset *cs,
>>  					struct task_struct *tsk)
>>  {
>>  	if (is_spread_page(cs))
>> -		tsk->flags |= PF_SPREAD_PAGE;
>> +		task_set_spread_page(tsk);
>>  	else
>> -		tsk->flags &= ~PF_SPREAD_PAGE;
>> +		task_clear_spread_page(tsk);
>> +
>>  	if (is_spread_slab(cs))
>> -		tsk->flags |= PF_SPREAD_SLAB;
>> +		task_set_spread_slab(tsk);
>>  	else
>> -		tsk->flags &= ~PF_SPREAD_SLAB;
>> +		task_clear_spread_slab(tsk);
>>  }
>>  
>>  /*
> 
> This most certainly needs commentary to specify why these have to be 
> atomic ops.
> .

It won't hurt to add more comments, but I don't think it's necessary, because
the reason for using atomic bitops seems obvious to me. Besides, there's no such
comment for no_new_privs, which is what tsk->atomic_flags was introduced for.
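
For the record, here is a minimal userspace sketch of the problem. It is not
kernel code. It replays one bad interleaving of two plain read-modify-write
sequences on a shared flags word, then keeps the same flag in a separate word
that is only touched with atomic operations. The helper names
(replay_lost_update, flag_set, flag_clear, flag_test, self_check) are made up
for illustration, C11 atomics stand in for the kernel's set_bit(), clear_bit()
and test_bit(), and the PF_USED_MATH value is assumed from sched.h of that era.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* PF_SPREAD_PAGE is the value removed by the patch; PF_USED_MATH is assumed. */
#define PF_USED_MATH	0x00002000u
#define PF_SPREAD_PAGE	0x01000000u

/* Bit number in the separate atomic word, as defined by the patch. */
#define PFA_SPREAD_PAGE	1

/*
 * Replay, step by step, one bad interleaving of two plain
 * read-modify-write sequences on the same flags word.
 */
static unsigned int replay_lost_update(unsigned int flags)
{
	unsigned int a = flags;		/* A: load (the task itself) */
	unsigned int b = flags;		/* B: load (cpuset updater) */

	a &= ~PF_USED_MATH;		/* A: clear PF_USED_MATH */
	b |= PF_SPREAD_PAGE;		/* B: set PF_SPREAD_PAGE */

	flags = a;			/* A: store */
	flags = b;			/* B: store, undoing A's clear */
	return flags;
}

/* Hypothetical helpers: userspace stand-ins for the kernel bitops. */
static void flag_set(atomic_ulong *word, int bit)
{
	atomic_fetch_or(word, 1UL << bit);
}

static void flag_clear(atomic_ulong *word, int bit)
{
	atomic_fetch_and(word, ~(1UL << bit));
}

static bool flag_test(atomic_ulong *word, int bit)
{
	return (atomic_load(word) >> bit) & 1UL;
}

/* Small self-check tying the two halves together. */
static void self_check(void)
{
	atomic_ulong atomic_flags = 0;

	/* The replayed race leaves PF_USED_MATH set although A cleared it. */
	assert(replay_lost_update(PF_USED_MATH) & PF_USED_MATH);

	/* Each atomic update is a single indivisible read-modify-write. */
	flag_set(&atomic_flags, PFA_SPREAD_PAGE);
	assert(flag_test(&atomic_flags, PFA_SPREAD_PAGE));
	flag_clear(&atomic_flags, PFA_SPREAD_PAGE);
	assert(!flag_test(&atomic_flags, PFA_SPREAD_PAGE));
}
```

Calling self_check() from a test program exercises both halves. Moving the
spread bits into their own word also means the cpuset updater no longer writes
to tsk->flags, so the task's own plain updates to tsk->flags cannot be
overwritten by it.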

