Generic Linux architectural discussions
 help / color / mirror / Atom feed
* Re: [PATCH] memcg: remove unneeded preempt_disable
       [not found] <1313650253-21794-1-git-send-email-gthelen@google.com>
@ 2011-08-18 21:40 ` Andrew Morton
  2011-08-18 21:40   ` Andrew Morton
                     ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Andrew Morton @ 2011-08-18 21:40 UTC (permalink / raw)
  To: Greg Thelen
  Cc: linux-kernel, linux-mm, KAMEZAWA Hiroyuki, Balbir Singh,
	Daisuke Nishimura, linux-arch

(cc linux-arch)

On Wed, 17 Aug 2011 23:50:53 -0700
Greg Thelen <gthelen@google.com> wrote:

> Both mem_cgroup_charge_statistics() and mem_cgroup_move_account() were
> unnecessarily disabling preemption when adjusting per-cpu counters:
>     preempt_disable()
>     __this_cpu_xxx()
>     __this_cpu_yyy()
>     preempt_enable()
> 
> This change does not disable preemption and thus CPU switch is possible
> within these routines.  This does not cause a problem because the total
> of all cpu counters is summed when reporting stats.  Now both
> mem_cgroup_charge_statistics() and mem_cgroup_move_account() look like:
>     this_cpu_xxx()
>     this_cpu_yyy()
> 
> ...
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -664,24 +664,20 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,
>  static void mem_cgroup_charge_statistics(struct mem_cgroup *mem,
>  					 bool file, int nr_pages)
>  {
> -	preempt_disable();
> -
>  	if (file)
> -		__this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
> +		this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
>  	else
> -		__this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
> +		this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
>  
>  	/* pagein of a big page is an event. So, ignore page size */
>  	if (nr_pages > 0)
> -		__this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
> +		this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
>  	else {
> -		__this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
> +		this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
>  		nr_pages = -nr_pages; /* for event */
>  	}
>  
> -	__this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
> -
> -	preempt_enable();
> +	this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
>  }

On non-x86 architectures this_cpu_add() internally does
preempt_disable() and preempt_enable().  So the patch is a small
optimisation for x86 and a larger deoptimisation for non-x86.

I think I'll apply it, as the call frequency is low (correct?) and the
problem will correct itself as other architectures implement their
atomic this_cpu_foo() operations.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-18 21:40 ` [PATCH] memcg: remove unneeded preempt_disable Andrew Morton
@ 2011-08-18 21:40   ` Andrew Morton
  2011-08-19  0:00   ` Greg Thelen
  2011-08-25 14:57   ` Peter Zijlstra
  2 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2011-08-18 21:40 UTC (permalink / raw)
  To: Greg Thelen
  Cc: linux-kernel, linux-mm, KAMEZAWA Hiroyuki, Balbir Singh,
	Daisuke Nishimura, linux-arch

(cc linux-arch)

On Wed, 17 Aug 2011 23:50:53 -0700
Greg Thelen <gthelen@google.com> wrote:

> Both mem_cgroup_charge_statistics() and mem_cgroup_move_account() were
> unnecessarily disabling preemption when adjusting per-cpu counters:
>     preempt_disable()
>     __this_cpu_xxx()
>     __this_cpu_yyy()
>     preempt_enable()
> 
> This change does not disable preemption and thus CPU switch is possible
> within these routines.  This does not cause a problem because the total
> of all cpu counters is summed when reporting stats.  Now both
> mem_cgroup_charge_statistics() and mem_cgroup_move_account() look like:
>     this_cpu_xxx()
>     this_cpu_yyy()
> 
> ...
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -664,24 +664,20 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,
>  static void mem_cgroup_charge_statistics(struct mem_cgroup *mem,
>  					 bool file, int nr_pages)
>  {
> -	preempt_disable();
> -
>  	if (file)
> -		__this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
> +		this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
>  	else
> -		__this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
> +		this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
>  
>  	/* pagein of a big page is an event. So, ignore page size */
>  	if (nr_pages > 0)
> -		__this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
> +		this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
>  	else {
> -		__this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
> +		this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
>  		nr_pages = -nr_pages; /* for event */
>  	}
>  
> -	__this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
> -
> -	preempt_enable();
> +	this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
>  }

On non-x86 architectures this_cpu_add() internally does
preempt_disable() and preempt_enable().  So the patch is a small
optimisation for x86 and a larger deoptimisation for non-x86.

I think I'll apply it, as the call frequency is low (correct?) and the
problem will correct itself as other architectures implement their
atomic this_cpu_foo() operations.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-18 21:40 ` [PATCH] memcg: remove unneeded preempt_disable Andrew Morton
  2011-08-18 21:40   ` Andrew Morton
@ 2011-08-19  0:00   ` Greg Thelen
  2011-08-25 14:57   ` Peter Zijlstra
  2 siblings, 0 replies; 27+ messages in thread
From: Greg Thelen @ 2011-08-19  0:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, KAMEZAWA Hiroyuki, Balbir Singh,
	Daisuke Nishimura, linux-arch, Valdis.Kletnieks, jweiner

On Thu, Aug 18, 2011 at 2:40 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> (cc linux-arch)
>
> On Wed, 17 Aug 2011 23:50:53 -0700
> Greg Thelen <gthelen@google.com> wrote:
>
>> Both mem_cgroup_charge_statistics() and mem_cgroup_move_account() were
>> unnecessarily disabling preemption when adjusting per-cpu counters:
>>     preempt_disable()
>>     __this_cpu_xxx()
>>     __this_cpu_yyy()
>>     preempt_enable()
>>
>> This change does not disable preemption and thus CPU switch is possible
>> within these routines.  This does not cause a problem because the total
>> of all cpu counters is summed when reporting stats.  Now both
>> mem_cgroup_charge_statistics() and mem_cgroup_move_account() look like:
>>     this_cpu_xxx()
>>     this_cpu_yyy()
>>
>> ...
>>
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -664,24 +664,20 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,
>>  static void mem_cgroup_charge_statistics(struct mem_cgroup *mem,
>>                                        bool file, int nr_pages)
>>  {
>> -     preempt_disable();
>> -
>>       if (file)
>> -             __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
>> +             this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
>>       else
>> -             __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
>> +             this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
>>
>>       /* pagein of a big page is an event. So, ignore page size */
>>       if (nr_pages > 0)
>> -             __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
>> +             this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
>>       else {
>> -             __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
>> +             this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
>>               nr_pages = -nr_pages; /* for event */
>>       }
>>
>> -     __this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
>> -
>> -     preempt_enable();
>> +     this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
>>  }
>
> On non-x86 architectures this_cpu_add() internally does
> preempt_disable() and preempt_enable().  So the patch is a small
> optimisation for x86 and a larger deoptimisation for non-x86.
>
> I think I'll apply it, as the call frequency is low (correct?) and the
> problem will correct itself as other architectures implement their
> atomic this_cpu_foo() operations.

mem_cgroup_charge_statistics() is a common operation, which is called
on each memcg page charge and uncharge.

The per arch/config effects of this patch:
* non-preemptible kernels: there's no difference before/after this patch.
* preemptible x86: this patch helps by removing an unnecessary
preempt_disable/enable.
* preemptible non-x86: this patch hurts by adding implicit
preempt_disable/enable around each operation.

So I am uncomfortable this patch's unmeasured impact on archs that do
not have atomic this_cpu_foo() operations.  Please drop the patch from
mmotm.  Sorry for the noise.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-18 21:40 ` [PATCH] memcg: remove unneeded preempt_disable Andrew Morton
  2011-08-18 21:40   ` Andrew Morton
  2011-08-19  0:00   ` Greg Thelen
@ 2011-08-25 14:57   ` Peter Zijlstra
  2011-08-25 15:11     ` Christoph Lameter
  2 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 14:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg Thelen, linux-kernel, linux-mm, KAMEZAWA Hiroyuki,
	Balbir Singh, Daisuke Nishimura, linux-arch

On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> 
> I think I'll apply it, as the call frequency is low (correct?) and the
> problem will correct itself as other architectures implement their
> atomic this_cpu_foo() operations. 

Which leads me to wonder, can anything but x86 implement that this_cpu_*
muck? I doubt any of the risk chips can actually do all this.
Maybe Itanic, but then that seems to be dying fast.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 14:57   ` Peter Zijlstra
@ 2011-08-25 15:11     ` Christoph Lameter
  2011-08-25 15:11       ` Christoph Lameter
  2011-08-25 16:20       ` James Bottomley
  0 siblings, 2 replies; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 15:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Greg Thelen, linux-kernel, linux-mm,
	KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura, linux-arch

On Thu, 25 Aug 2011, Peter Zijlstra wrote:

> On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> >
> > I think I'll apply it, as the call frequency is low (correct?) and the
> > problem will correct itself as other architectures implement their
> > atomic this_cpu_foo() operations.
>
> Which leads me to wonder, can anything but x86 implement that this_cpu_*
> muck? I doubt any of the risk chips can actually do all this.
> Maybe Itanic, but then that seems to be dying fast.

The cpu needs to have an RMW instruction that does something to a
variable relative to a register that points to the per cpu base.

Thats generally possible. The problem is how expensive the RMW is going to
be.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 15:11     ` Christoph Lameter
@ 2011-08-25 15:11       ` Christoph Lameter
  2011-08-25 16:20       ` James Bottomley
  1 sibling, 0 replies; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 15:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Greg Thelen, linux-kernel, linux-mm,
	KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura, linux-arch

On Thu, 25 Aug 2011, Peter Zijlstra wrote:

> On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> >
> > I think I'll apply it, as the call frequency is low (correct?) and the
> > problem will correct itself as other architectures implement their
> > atomic this_cpu_foo() operations.
>
> Which leads me to wonder, can anything but x86 implement that this_cpu_*
> muck? I doubt any of the risk chips can actually do all this.
> Maybe Itanic, but then that seems to be dying fast.

The cpu needs to have an RMW instruction that does something to a
variable relative to a register that points to the per cpu base.

Thats generally possible. The problem is how expensive the RMW is going to
be.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 15:11     ` Christoph Lameter
  2011-08-25 15:11       ` Christoph Lameter
@ 2011-08-25 16:20       ` James Bottomley
  2011-08-25 16:31         ` Christoph Lameter
  1 sibling, 1 reply; 27+ messages in thread
From: James Bottomley @ 2011-08-25 16:20 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> 
> > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > >
> > > I think I'll apply it, as the call frequency is low (correct?) and the
> > > problem will correct itself as other architectures implement their
> > > atomic this_cpu_foo() operations.
> >
> > Which leads me to wonder, can anything but x86 implement that this_cpu_*
> > muck? I doubt any of the risk chips can actually do all this.
> > Maybe Itanic, but then that seems to be dying fast.
> 
> The cpu needs to have an RMW instruction that does something to a
> variable relative to a register that points to the per cpu base.
> 
> Thats generally possible. The problem is how expensive the RMW is going to
> be.

Risc systems generally don't have a single instruction for this, that's
correct.  Obviously we can do it as a non atomic sequence: read
variable, compute relative, read, modify, write ... but there's
absolutely no point hand crafting that in asm since the compiler can
usually work it out nicely.  And, of course, to have this atomic, we
have to use locks, which ends up being very expensive.

James

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 16:20       ` James Bottomley
@ 2011-08-25 16:31         ` Christoph Lameter
  2011-08-25 16:31           ` Christoph Lameter
                             ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 16:31 UTC (permalink / raw)
  To: James Bottomley
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 25 Aug 2011, James Bottomley wrote:

> On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> >
> > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > > >
> > > > I think I'll apply it, as the call frequency is low (correct?) and the
> > > > problem will correct itself as other architectures implement their
> > > > atomic this_cpu_foo() operations.
> > >
> > > Which leads me to wonder, can anything but x86 implement that this_cpu_*
> > > muck? I doubt any of the risk chips can actually do all this.
> > > Maybe Itanic, but then that seems to be dying fast.
> >
> > The cpu needs to have an RMW instruction that does something to a
> > variable relative to a register that points to the per cpu base.
> >
> > Thats generally possible. The problem is how expensive the RMW is going to
> > be.
>
> Risc systems generally don't have a single instruction for this, that's
> correct.  Obviously we can do it as a non atomic sequence: read
> variable, compute relative, read, modify, write ... but there's
> absolutely no point hand crafting that in asm since the compiler can
> usually work it out nicely.  And, of course, to have this atomic, we
> have to use locks, which ends up being very expensive.

ARM seems to have these LDREX/STREX instructions for that purpose which
seem to be used for generating atomic instructions without lockes. I guess
other RISC architectures have similar means of doing it?




--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 16:31         ` Christoph Lameter
@ 2011-08-25 16:31           ` Christoph Lameter
  2011-08-25 16:34           ` James Bottomley
  2011-08-25 18:40           ` Peter Zijlstra
  2 siblings, 0 replies; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 16:31 UTC (permalink / raw)
  To: James Bottomley
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 25 Aug 2011, James Bottomley wrote:

> On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> >
> > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > > >
> > > > I think I'll apply it, as the call frequency is low (correct?) and the
> > > > problem will correct itself as other architectures implement their
> > > > atomic this_cpu_foo() operations.
> > >
> > > Which leads me to wonder, can anything but x86 implement that this_cpu_*
> > > muck? I doubt any of the risk chips can actually do all this.
> > > Maybe Itanic, but then that seems to be dying fast.
> >
> > The cpu needs to have an RMW instruction that does something to a
> > variable relative to a register that points to the per cpu base.
> >
> > Thats generally possible. The problem is how expensive the RMW is going to
> > be.
>
> Risc systems generally don't have a single instruction for this, that's
> correct.  Obviously we can do it as a non atomic sequence: read
> variable, compute relative, read, modify, write ... but there's
> absolutely no point hand crafting that in asm since the compiler can
> usually work it out nicely.  And, of course, to have this atomic, we
> have to use locks, which ends up being very expensive.

ARM seems to have these LDREX/STREX instructions for that purpose which
seem to be used for generating atomic instructions without lockes. I guess
other RISC architectures have similar means of doing it?





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 16:31         ` Christoph Lameter
  2011-08-25 16:31           ` Christoph Lameter
@ 2011-08-25 16:34           ` James Bottomley
  2011-08-25 16:34             ` James Bottomley
  2011-08-25 17:07             ` Christoph Lameter
  2011-08-25 18:40           ` Peter Zijlstra
  2 siblings, 2 replies; 27+ messages in thread
From: James Bottomley @ 2011-08-25 16:34 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch



Christoph Lameter <cl@linux.com> wrote:

>On Thu, 25 Aug 2011, James Bottomley wrote:
>
>> On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
>> > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
>> >
>> > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
>> > > >
>> > > > I think I'll apply it, as the call frequency is low (correct?)
>and the
>> > > > problem will correct itself as other architectures implement
>their
>> > > > atomic this_cpu_foo() operations.
>> > >
>> > > Which leads me to wonder, can anything but x86 implement that
>this_cpu_*
>> > > muck? I doubt any of the risk chips can actually do all this.
>> > > Maybe Itanic, but then that seems to be dying fast.
>> >
>> > The cpu needs to have an RMW instruction that does something to a
>> > variable relative to a register that points to the per cpu base.
>> >
>> > Thats generally possible. The problem is how expensive the RMW is
>going to
>> > be.
>>
>> Risc systems generally don't have a single instruction for this,
>that's
>> correct.  Obviously we can do it as a non atomic sequence: read
>> variable, compute relative, read, modify, write ... but there's
>> absolutely no point hand crafting that in asm since the compiler can
>> usually work it out nicely.  And, of course, to have this atomic, we
>> have to use locks, which ends up being very expensive.
>
>ARM seems to have these LDREX/STREX instructions for that purpose which
>seem to be used for generating atomic instructions without lockes. I
>guess
>other RISC architectures have similar means of doing it?

Arm isn't really risc.  Most don't.  However even with ldrex/strex you need two instructions for rmw.

James
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity and top posting.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 16:34           ` James Bottomley
@ 2011-08-25 16:34             ` James Bottomley
  2011-08-25 17:07             ` Christoph Lameter
  1 sibling, 0 replies; 27+ messages in thread
From: James Bottomley @ 2011-08-25 16:34 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch



Christoph Lameter <cl@linux.com> wrote:

>On Thu, 25 Aug 2011, James Bottomley wrote:
>
>> On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
>> > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
>> >
>> > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
>> > > >
>> > > > I think I'll apply it, as the call frequency is low (correct?)
>and the
>> > > > problem will correct itself as other architectures implement
>their
>> > > > atomic this_cpu_foo() operations.
>> > >
>> > > Which leads me to wonder, can anything but x86 implement that
>this_cpu_*
>> > > muck? I doubt any of the risk chips can actually do all this.
>> > > Maybe Itanic, but then that seems to be dying fast.
>> >
>> > The cpu needs to have an RMW instruction that does something to a
>> > variable relative to a register that points to the per cpu base.
>> >
>> > Thats generally possible. The problem is how expensive the RMW is
>going to
>> > be.
>>
>> Risc systems generally don't have a single instruction for this,
>that's
>> correct.  Obviously we can do it as a non atomic sequence: read
>> variable, compute relative, read, modify, write ... but there's
>> absolutely no point hand crafting that in asm since the compiler can
>> usually work it out nicely.  And, of course, to have this atomic, we
>> have to use locks, which ends up being very expensive.
>
>ARM seems to have these LDREX/STREX instructions for that purpose which
>seem to be used for generating atomic instructions without lockes. I
>guess
>other RISC architectures have similar means of doing it?

Arm isn't really risc.  Most don't.  However even with ldrex/strex you need two instructions for rmw.

James
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity and top posting.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 16:34           ` James Bottomley
  2011-08-25 16:34             ` James Bottomley
@ 2011-08-25 17:07             ` Christoph Lameter
  2011-08-25 18:34               ` James Bottomley
  1 sibling, 1 reply; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 17:07 UTC (permalink / raw)
  To: James Bottomley
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 25 Aug 2011, James Bottomley wrote:

> >ARM seems to have these LDREX/STREX instructions for that purpose which
> >seem to be used for generating atomic instructions without lockes. I
> >guess
> >other RISC architectures have similar means of doing it?
>
> Arm isn't really risc.  Most don't.  However even with ldrex/strex you need two instructions for rmw.

Well then what is "really risc"? RISC is an old beaten down marketing term
AFAICT and ARM claims it too.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 17:07             ` Christoph Lameter
@ 2011-08-25 18:34               ` James Bottomley
  2011-08-25 18:46                 ` Christoph Lameter
  0 siblings, 1 reply; 27+ messages in thread
From: James Bottomley @ 2011-08-25 18:34 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch



Christoph Lameter <cl@linux.com> wrote:

>On Thu, 25 Aug 2011, James Bottomley wrote:
>
>> >ARM seems to have these LDREX/STREX instructions for that purpose
>which
>> >seem to be used for generating atomic instructions without lockes. I
>> >guess
>> >other RISC architectures have similar means of doing it?
>>
>> Arm isn't really risc.  Most don't.  However even with ldrex/strex
>you need two instructions for rmw.
>
>Well then what is "really risc"? RISC is an old beaten down marketing
>term
>AFAICT and ARM claims it too.

Reduced Instruction Set Computer.  This is why we're unlikely to have complex atomic instructions: the principle of risc is that you build them up from basic ones.

James 
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity and top posting.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 16:31         ` Christoph Lameter
  2011-08-25 16:31           ` Christoph Lameter
  2011-08-25 16:34           ` James Bottomley
@ 2011-08-25 18:40           ` Peter Zijlstra
  2011-08-25 18:40             ` Peter Zijlstra
  2 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 18:40 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 11:31 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, James Bottomley wrote:
> 
> > On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> > > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> > >
> > > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > > > >
> > > > > I think I'll apply it, as the call frequency is low (correct?) and the
> > > > > problem will correct itself as other architectures implement their
> > > > > atomic this_cpu_foo() operations.
> > > >
> > > > Which leads me to wonder, can anything but x86 implement that this_cpu_*
> > > > muck? I doubt any of the risk chips can actually do all this.
> > > > Maybe Itanic, but then that seems to be dying fast.
> > >
> > > The cpu needs to have an RMW instruction that does something to a
> > > variable relative to a register that points to the per cpu base.
> > >
> > > Thats generally possible. The problem is how expensive the RMW is going to
> > > be.
> >
> > Risc systems generally don't have a single instruction for this, that's
> > correct.  Obviously we can do it as a non atomic sequence: read
> > variable, compute relative, read, modify, write ... but there's
> > absolutely no point hand crafting that in asm since the compiler can
> > usually work it out nicely.  And, of course, to have this atomic, we
> > have to use locks, which ends up being very expensive.
> 
> ARM seems to have these LDREX/STREX instructions for that purpose which
> seem to be used for generating atomic instructions without lockes. I guess
> other RISC architectures have similar means of doing it?

Even with LL/SC and the CPU base in a register you need to do something
like:

again:
	LL $target-reg, $cpubase-reg + offset
	<foo>
	SC $ret, $target-reg, $cpubase-reg + offset
	if !$ret goto again

Its the +offset that's problematic, it either doesn't exist or is very
limited (a quick look at the MIPS instruction set gives a limit of 64k).

Without the +offset you need:

again:
	$tmp-reg = $cpubase-reg
	$tmp-reg += offset;

	LL $target-reg, $tmp-reg
	<foo>
	SC $ret, $target-reg, $tmp-reg
	if !$ret goto again


Which is wide open to migration races. Also, very often there are
constraints on LL/SC that mandate we use preempt_disable/enable around
its use, which pretty much voids the whole purpose, since if we disable
preemption we might as well just use C (ARM belongs in this class).

It does look POWERPC's lwarx/stwcx is sane enough, although the
instruction reference I found doesn't list what happens if the LL/SC
doesn't use the same effective address or has other loads/stores in
between, if its ok with those and simply fails the SC it should be good.

Still, creating atomic ops for per-cpu ops might be more expensive than
simply doing the preempt-disable/rmw/enable dance, dunno don't know
these archs that well.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:40           ` Peter Zijlstra
@ 2011-08-25 18:40             ` Peter Zijlstra
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 18:40 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 11:31 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, James Bottomley wrote:
> 
> > On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> > > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> > >
> > > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > > > >
> > > > > I think I'll apply it, as the call frequency is low (correct?) and the
> > > > > problem will correct itself as other architectures implement their
> > > > > atomic this_cpu_foo() operations.
> > > >
> > > > Which leads me to wonder, can anything but x86 implement that this_cpu_*
> > > > muck? I doubt any of the risk chips can actually do all this.
> > > > Maybe Itanic, but then that seems to be dying fast.
> > >
> > > The cpu needs to have an RMW instruction that does something to a
> > > variable relative to a register that points to the per cpu base.
> > >
> > > Thats generally possible. The problem is how expensive the RMW is going to
> > > be.
> >
> > Risc systems generally don't have a single instruction for this, that's
> > correct.  Obviously we can do it as a non atomic sequence: read
> > variable, compute relative, read, modify, write ... but there's
> > absolutely no point hand crafting that in asm since the compiler can
> > usually work it out nicely.  And, of course, to have this atomic, we
> > have to use locks, which ends up being very expensive.
> 
> ARM seems to have these LDREX/STREX instructions for that purpose which
> seem to be used for generating atomic instructions without lockes. I guess
> other RISC architectures have similar means of doing it?

Even with LL/SC and the CPU base in a register you need to do something
like:

again:
	LL $target-reg, $cpubase-reg + offset
	<foo>
	SC $ret, $target-reg, $cpubase-reg + offset
	if !$ret goto again

Its the +offset that's problematic, it either doesn't exist or is very
limited (a quick look at the MIPS instruction set gives a limit of 64k).

Without the +offset you need:

again:
	$tmp-reg = $cpubase-reg
	$tmp-reg += offset;

	LL $target-reg, $tmp-reg
	<foo>
	SC $ret, $target-reg, $tmp-reg
	if !$ret goto again


Which is wide open to migration races. Also, very often there are
constraints on LL/SC that mandate we use preempt_disable/enable around
its use, which pretty much voids the whole purpose, since if we disable
preemption we might as well just use C (ARM belongs in this class).

It does look POWERPC's lwarx/stwcx is sane enough, although the
instruction reference I found doesn't list what happens if the LL/SC
doesn't use the same effective address or has other loads/stores in
between, if its ok with those and simply fails the SC it should be good.

Still, creating atomic ops for per-cpu ops might be more expensive than
simply doing the preempt-disable/rmw/enable dance, dunno don't know
these archs that well.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:34               ` James Bottomley
@ 2011-08-25 18:46                 ` Christoph Lameter
  2011-08-25 18:49                   ` Peter Zijlstra
                                     ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 18:46 UTC (permalink / raw)
  To: James Bottomley
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 25 Aug 2011, James Bottomley wrote:

> >Well then what is "really risc"? RISC is an old beaten down marketing
> >term
> >AFAICT and ARM claims it too.
>
> Reduced Instruction Set Computer.  This is why we're unlikely to have
> complex atomic instructions: the principle of risc is that you build
> them up from basic ones.

RISC cpus have instruction to construct complex atomic actions by the cpu
as I have shown before for ARM.

Principles always have exceptions to them.

(That statement in itself is a principle that should have an exception I
guess. But then language often only makes sense when it contains
contradictions.)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:46                 ` Christoph Lameter
@ 2011-08-25 18:49                   ` Peter Zijlstra
  2011-08-25 18:49                     ` Peter Zijlstra
  2011-08-25 18:53                   ` Peter Zijlstra
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 18:49 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 13:46 -0500, Christoph Lameter wrote:
> 
> RISC cpus have instruction to construct complex atomic actions by the cpu
> as I have shown before for ARM. 

Right, but it only makes sense if the whole thing remains cheaper than
the trivial implementation already available.

For instance, the ARM LL/SC constraints pretty much mandate we do
preempt_disable()/preempt_enable() around them, at which point the point
of doing LL/SC is gone (except maybe for the irqsafe_this_cpu_* stuff).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:49                   ` Peter Zijlstra
@ 2011-08-25 18:49                     ` Peter Zijlstra
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 18:49 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 13:46 -0500, Christoph Lameter wrote:
> 
> RISC cpus have instruction to construct complex atomic actions by the cpu
> as I have shown before for ARM. 

Right, but it only makes sense if the whole thing remains cheaper than
the trivial implementation already available.

For instance, the ARM LL/SC constraints pretty much mandate we do
preempt_disable()/preempt_enable() around them, at which point the point
of doing LL/SC is gone (except maybe for the irqsafe_this_cpu_* stuff).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:46                 ` Christoph Lameter
  2011-08-25 18:49                   ` Peter Zijlstra
@ 2011-08-25 18:53                   ` Peter Zijlstra
  2011-08-25 18:53                     ` Peter Zijlstra
  2011-08-25 19:05                   ` Peter Zijlstra
  2011-08-25 19:29                   ` James Bottomley
  3 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 18:53 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 20:49 +0200, Peter Zijlstra wrote:
> the ARM LL/SC constraints pretty much mandate we do
> preempt_disable()/preempt_enable() around them, 

My bad, that was MIPS, it states that portable programs should not issue
load/store/prefetch insn inside a LL/SC region or the behaviour is
undefined or somesuch.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:53                   ` Peter Zijlstra
@ 2011-08-25 18:53                     ` Peter Zijlstra
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 18:53 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 20:49 +0200, Peter Zijlstra wrote:
> the ARM LL/SC constraints pretty much mandate we do
> preempt_disable()/preempt_enable() around them, 

My bad, that was MIPS, it states that portable programs should not issue
load/store/prefetch insn inside a LL/SC region or the behaviour is
undefined or somesuch.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:46                 ` Christoph Lameter
  2011-08-25 18:49                   ` Peter Zijlstra
  2011-08-25 18:53                   ` Peter Zijlstra
@ 2011-08-25 19:05                   ` Peter Zijlstra
  2011-08-25 19:19                     ` Christoph Lameter
  2011-08-25 19:29                   ` James Bottomley
  3 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 19:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 13:46 -0500, Christoph Lameter wrote:
> 
> RISC cpus have instruction to construct complex atomic actions by the cpu
> as I have shown before for ARM. 

Also, I thought this_cpu thing's were at best locally atomic. If you
make them full blown atomic ops then even __this_cpu ops will have to be
full atomic ops, otherwise:


CPU0			CPU(1)

this_cpu_inc(&foo);	preempt_disable();
			__this_cpu_inc(&foo);
			preempt_enable();

might step on each other's toes.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 19:05                   ` Peter Zijlstra
@ 2011-08-25 19:19                     ` Christoph Lameter
  2011-08-25 22:56                       ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Lameter @ 2011-08-25 19:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 25 Aug 2011, Peter Zijlstra wrote:

> Also, I thought this_cpu thing's were at best locally atomic. If you
> make them full blown atomic ops then even __this_cpu ops will have to be
> full atomic ops, otherwise:
>
>
> CPU0			CPU(1)
>
> this_cpu_inc(&foo);	preempt_disable();
> 			__this_cpu_inc(&foo);
> 			preempt_enable();
>
> might step on each other's toes.

They would both have their own instance of "foo". per cpu atomicity is
only one requirement of this_cpu_ops. The other is the ability to relocate
accesses relative to the current per cpu area.

Full blown atomicity is almost a superset of per cpu atomicity but its
only usable if the full atomic instructions can also relocate accesses
relative to some base.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 18:46                 ` Christoph Lameter
                                     ` (2 preceding siblings ...)
  2011-08-25 19:05                   ` Peter Zijlstra
@ 2011-08-25 19:29                   ` James Bottomley
  2011-08-25 23:01                     ` Peter Zijlstra
  3 siblings, 1 reply; 27+ messages in thread
From: James Bottomley @ 2011-08-25 19:29 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 13:46 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, James Bottomley wrote:
> 
> > >Well then what is "really risc"? RISC is an old beaten down marketing
> > >term
> > >AFAICT and ARM claims it too.
> >
> > Reduced Instruction Set Computer.  This is why we're unlikely to have
> > complex atomic instructions: the principle of risc is that you build
> > them up from basic ones.
> 
> RISC cpus have instruction to construct complex atomic actions by the cpu
> as I have shown before for ARM.
> 
> Principles always have exceptions to them.
> 
> (That statement in itself is a principle that should have an exception I
> guess. But then language often only makes sense when it contains
> contradictions.)

We seem to be talking at cross purposes.  I'm not saying a risc system
can't do this ... of course the risc primitives can build into whatever
you want.  To make it atomic, you simply add locking.  What I'm saying
is that open coding asm in a macro makes no sense because the compiler
will do it better from C.  Plus, since the net purpose of this patch is
to require us to lock around each op instead of doing a global lock (or
in this case preempt disable) then you're making us less efficient at
executing it.

Therefore from the risc point of view, most of the this_cpu_xxx
operations are things that we don't really care about except that the
result would be easier to read in C.

James

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 19:19                     ` Christoph Lameter
@ 2011-08-25 22:56                       ` Peter Zijlstra
  2011-08-25 22:56                         ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 22:56 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 14:19 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> 
> > Also, I thought this_cpu thing's were at best locally atomic. If you
> > make them full blown atomic ops then even __this_cpu ops will have to be
> > full atomic ops, otherwise:
> >
> >
> > CPU0			CPU(1)
> >
> > this_cpu_inc(&foo);	preempt_disable();
> > 			__this_cpu_inc(&foo);
> > 			preempt_enable();
> >
> > might step on each other's toes.
> 
> They would both have their own instance of "foo". per cpu atomicity is
> only one requirement of this_cpu_ops. The other is the ability to relocate
> accesses relative to the current per cpu area.

Ah, but not if the this_cpu_inc() thing ends up being more than a single
instruction, then you have preemption/migration windows. Only when LL/SC
can deal with SC having a different EA from the LL and supports a big
enough offset could this possibly work.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 22:56                       ` Peter Zijlstra
@ 2011-08-25 22:56                         ` Peter Zijlstra
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 22:56 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: James Bottomley, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 14:19 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> 
> > Also, I thought this_cpu thing's were at best locally atomic. If you
> > make them full blown atomic ops then even __this_cpu ops will have to be
> > full atomic ops, otherwise:
> >
> >
> > CPU0			CPU(1)
> >
> > this_cpu_inc(&foo);	preempt_disable();
> > 			__this_cpu_inc(&foo);
> > 			preempt_enable();
> >
> > might step on each other's toes.
> 
> They would both have their own instance of "foo". per cpu atomicity is
> only one requirement of this_cpu_ops. The other is the ability to relocate
> accesses relative to the current per cpu area.

Ah, but not if the this_cpu_inc() thing ends up being more than a single
instruction, then you have preemption/migration windows. Only when LL/SC
can deal with SC having a different EA from the LL and supports a big
enough offset could this possibly work.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 19:29                   ` James Bottomley
@ 2011-08-25 23:01                     ` Peter Zijlstra
  2011-08-25 23:01                       ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 23:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Lameter, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 12:29 -0700, James Bottomley wrote:
> 
> Therefore from the risc point of view, most of the this_cpu_xxx
> operations are things that we don't really care about except that the
> result would be easier to read in C. 

Right, so the current fallback case is pretty much the optimal case for
the RISC machines, which ends up with generic code being better off not
using it much and instead preferring __this_cpu if there's more than
one.

I mean, its absolutely awesome these things are 1 instruction on x86,
but if we pessimize all other 20-odd architectures its just not cool.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] memcg: remove unneeded preempt_disable
  2011-08-25 23:01                     ` Peter Zijlstra
@ 2011-08-25 23:01                       ` Peter Zijlstra
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Zijlstra @ 2011-08-25 23:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Lameter, Andrew Morton, Greg Thelen, linux-kernel,
	linux-mm, KAMEZAWA Hiroyuki, Balbir Singh, Daisuke Nishimura,
	linux-arch

On Thu, 2011-08-25 at 12:29 -0700, James Bottomley wrote:
> 
> Therefore from the risc point of view, most of the this_cpu_xxx
> operations are things that we don't really care about except that the
> result would be easier to read in C. 

Right, so the current fallback case is pretty much the optimal case for
the RISC machines, which ends up with generic code being better off not
using it much and instead preferring __this_cpu if there's more than
one.

I mean, its absolutely awesome these things are 1 instruction on x86,
but if we pessimize all other 20-odd architectures its just not cool.


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2011-08-25 23:01 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1313650253-21794-1-git-send-email-gthelen@google.com>
2011-08-18 21:40 ` [PATCH] memcg: remove unneeded preempt_disable Andrew Morton
2011-08-18 21:40   ` Andrew Morton
2011-08-19  0:00   ` Greg Thelen
2011-08-25 14:57   ` Peter Zijlstra
2011-08-25 15:11     ` Christoph Lameter
2011-08-25 15:11       ` Christoph Lameter
2011-08-25 16:20       ` James Bottomley
2011-08-25 16:31         ` Christoph Lameter
2011-08-25 16:31           ` Christoph Lameter
2011-08-25 16:34           ` James Bottomley
2011-08-25 16:34             ` James Bottomley
2011-08-25 17:07             ` Christoph Lameter
2011-08-25 18:34               ` James Bottomley
2011-08-25 18:46                 ` Christoph Lameter
2011-08-25 18:49                   ` Peter Zijlstra
2011-08-25 18:49                     ` Peter Zijlstra
2011-08-25 18:53                   ` Peter Zijlstra
2011-08-25 18:53                     ` Peter Zijlstra
2011-08-25 19:05                   ` Peter Zijlstra
2011-08-25 19:19                     ` Christoph Lameter
2011-08-25 22:56                       ` Peter Zijlstra
2011-08-25 22:56                         ` Peter Zijlstra
2011-08-25 19:29                   ` James Bottomley
2011-08-25 23:01                     ` Peter Zijlstra
2011-08-25 23:01                       ` Peter Zijlstra
2011-08-25 18:40           ` Peter Zijlstra
2011-08-25 18:40             ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox