From: Cody P Schafer <cody@linux.vnet.ibm.com>
To: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Simon Jeons <simon.jeons@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	Mel Gorman <mgorman@suse.de>, Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 03/10] mm/page_alloc: insert memory barriers to allow async update of pcp batch and high
Date: Wed, 10 Apr 2013 11:31:32 -0700
Message-ID: <5165B004.3080100@linux.vnet.ibm.com>
In-Reply-To: <CAOtvUMe8zZwZaUYDiGeLskkdPzPZGXYh6Wm0MKt0St0OSqDExg@mail.gmail.com>

On 04/09/2013 11:22 PM, Gilad Ben-Yossef wrote:
> On Wed, Apr 10, 2013 at 9:19 AM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
>> On Wed, Apr 10, 2013 at 2:28 AM, Cody P Schafer <cody@linux.vnet.ibm.com> wrote:
>>> In pageset_set_batch() and setup_pagelist_highmark(), ensure that batch
>>> is always set to a safe value (1) prior to updating high, and ensure
>>> that high is fully updated before setting the real value of batch.
>>>
>>> Suggested by Gilad Ben-Yossef <gilad@benyossef.com> in this thread:
>>>
>>>          https://lkml.org/lkml/2013/4/9/23
>>>
>>> Also reproduces his proposed comment.
>>>
>>> Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
>>> ---
>>>   mm/page_alloc.c | 19 +++++++++++++++++++
>>>   1 file changed, 19 insertions(+)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index d259599..a07bd4c 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -4007,11 +4007,26 @@ static int __meminit zone_batchsize(struct zone *zone)
>>>   #endif
>>>   }
>>>
>>> +static void pageset_update_prep(struct per_cpu_pages *pcp)
>>> +{
>>> +       /*
>>> +        * We're about to mess with PCP in a non-atomic fashion.  Put an
>>> +        * intermediate safe value of batch and make sure it is visible before
>>> +        * any other change
>>> +        */
>>> +       pcp->batch = 1;
>>> +       smp_wmb();
>>> +}
>>> +
>>>   /* a companion to setup_pagelist_highmark() */
>>>   static void pageset_set_batch(struct per_cpu_pageset *p, unsigned long batch)
>>>   {
>>>          struct per_cpu_pages *pcp = &p->pcp;
>>> +       pageset_update_prep(pcp);
>>> +
>>>          pcp->high = 6 * batch;
>>> +       smp_wmb();
>>> +
>>>          pcp->batch = max(1UL, 1 * batch);
>>>   }
>>>
>>> @@ -4039,7 +4054,11 @@ static void setup_pagelist_highmark(struct per_cpu_pageset *p,
>>>          struct per_cpu_pages *pcp;
>>>
>>>          pcp = &p->pcp;
>>> +       pageset_update_prep(pcp);
>>> +
>>>          pcp->high = high;
>>> +       smp_wmb();
>>> +
>>>          pcp->batch = max(1UL, high/4);
>>>          if ((high/4) > (PAGE_SHIFT * 8))
>>>                  pcp->batch = PAGE_SHIFT * 8;
>>> --
>>> 1.8.2
>>>
>>
>> That is very good.
>> However, now we've created a "protocol" for updating ->high and ->batch:
>>
>> 1. Call pageset_update_prep(pcp)
>> 2. Update ->high
>> 3. smp_wmb()
>> 4. Update ->batch
>>
>> But that protocol is not documented anywhere, and someone who reads the
>> code two years from now will not be aware of it or why it is done like that.
>>
>> How about if we create:
>>
>> /*
>>  * pcp->high and pcp->batch values are related and dependent on one another:
>>  * ->batch must never be higher than ->high.
>>  * The following function updates them in a safe manner without a
>>  * costly atomic transaction.
>>  */
>> static void pageset_update(struct per_cpu_pages *pcp, unsigned int high,
>>                            unsigned int batch)
>> {
>>         /* start with a fail safe value for batch */
>>         pcp->batch = 1;
>>         smp_wmb();
>>
>>         /* Update high, then batch, in order */
>>         pcp->high = high;
>>         smp_wmb();
>>         pcp->batch = batch;
>> }
>>
>> And use that at the update sites? Then the protocol becomes explicit.

Yep, this looks like exactly the right thing.
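
For illustration, here is a rough user-space sketch of how the two update
sites could funnel through one helper. It is not the v3 patch. The names
pcp_sketch, wmb_sketch(), the *_sketch() functions and PAGE_SHIFT_SKETCH
(assumed to be 12) are made up for this example, and a C11 release fence
stands in for smp_wmb() as an approximation only. Readers of ->high and
->batch would need matching ordering on their side, which is not shown.

```c
#include <stdatomic.h>

/* Assumption: 4 KiB pages. The kernel takes PAGE_SHIFT from the arch. */
#define PAGE_SHIFT_SKETCH 12UL

/* Hypothetical user-space stand-in for struct per_cpu_pages. */
struct pcp_sketch {
	atomic_ulong high;
	atomic_ulong batch;
};

/* Approximates smp_wmb(): earlier stores are ordered before later ones. */
static inline void wmb_sketch(void)
{
	atomic_thread_fence(memory_order_release);
}

/*
 * ->batch must never be higher than ->high, so drop batch to a fail-safe
 * value first, then publish high, then publish the real batch.
 */
static void pageset_update_sketch(struct pcp_sketch *pcp,
				  unsigned long high, unsigned long batch)
{
	/* start with a fail-safe value for batch */
	atomic_store_explicit(&pcp->batch, 1UL, memory_order_relaxed);
	wmb_sketch();

	/* update high, then batch, in order */
	atomic_store_explicit(&pcp->high, high, memory_order_relaxed);
	wmb_sketch();
	atomic_store_explicit(&pcp->batch, batch, memory_order_relaxed);
}

/* Update site 1: mirrors pageset_set_batch() from the quoted diff. */
static void pageset_set_batch_sketch(struct pcp_sketch *pcp,
				     unsigned long batch)
{
	pageset_update_sketch(pcp, 6 * batch, batch > 1UL ? batch : 1UL);
}

/* Update site 2: mirrors setup_pagelist_highmark() from the quoted diff. */
static void pageset_set_high_sketch(struct pcp_sketch *pcp,
				    unsigned long high)
{
	unsigned long batch = high / 4 > 1UL ? high / 4 : 1UL;

	/* same cap as the quoted code: at most PAGE_SHIFT * 8 */
	if (batch > PAGE_SHIFT_SKETCH * 8)
		batch = PAGE_SHIFT_SKETCH * 8;

	pageset_update_sketch(pcp, high, batch);
}
```

With this shape the store ordering lives in one function, so a new update
site cannot forget one of the barriers.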

>
> Oh, and other than that it looks good to me, so assuming you do that -
>
> Reviewed-By: Gilad Ben-Yossef <gilad@benyossef.com>

I've added it only to the patch with pageset_update() in it; if you
meant to apply it to more patches, feel free to reply to the v3 posting.

