From: Andrew Morton <akpm@linux-foundation.org>
To: Petr Holasek <pholasek@redhat.com>
Cc: Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Chris Wright <chrisw@sous-sol.org>,
	Izik Eidus <izik.eidus@ravellosystems.com>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Anton Arapov <anton@redhat.com>
Subject: Re: [PATCH v2] KSM: numa awareness sysfs knob
Date: Fri, 29 Jun 2012 14:17:59 -0700
Message-ID: <20120629141759.3312b49e.akpm@linux-foundation.org>
In-Reply-To: <1340970592-25001-1-git-send-email-pholasek@redhat.com>

On Fri, 29 Jun 2012 13:49:52 +0200
Petr Holasek <pholasek@redhat.com> wrote:

> Introduces a new sysfs boolean knob, /sys/kernel/mm/ksm/merge_nodes,
> which controls merging of pages across different NUMA nodes.
> When it is set to zero, only pages from the same node are merged;
> otherwise pages from all nodes can be merged together (default behavior).
> 
> A typical use case is many KVM guests on a NUMA machine, where
> CPUs on more distant nodes would see significantly higher access
> latency to the merged KSM page. A sysfs knob was chosen
> for higher scalability.
> 
> Every NUMA node has its own stable and unstable trees, for
> faster searching and inserting. The merge_nodes value can be
> changed only when there are no KSM shared pages in the system.

It would be neat to have a knob which enables KSM for all anon
mappings, i.e. one which pretends that MADV_MERGEABLE is always set,
for testing coverage purposes.

> I've tested this patch on NUMA machines with 2, 4 and 8 nodes and
> measured the speed of memory access inside KVM guests with memory pinned
> to one of the nodes, using this benchmark:
> 
> http://pholasek.fedorapeople.org/alloc_pg.c
> 
> Population standard deviations of access times, as a percentage of the
> average, were as follows:
> 
> merge_nodes=1
> 2 nodes   1.4%
> 4 nodes   1.6%
> 8 nodes   1.7%
> 
> merge_nodes=0
> 2 nodes   1%
> 4 nodes   0.32%
> 8 nodes   0.018%

ooh, numbers!  Thanks.
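
For readers who want to reproduce the metric quoted above, here is a
minimal sketch of "population standard deviation as a percentage of the
average". The function names are hypothetical and nothing here is taken
from alloc_pg.c; the square root is a small Newton iteration only so the
sketch builds without linking libm.

```c
#include <stddef.h>

/* Hypothetical helper: square root by Newton's iteration (avoids -lm). */
static double sqrt_newton(double x)
{
	double r = x > 1.0 ? x : 1.0;
	int i;

	if (x <= 0.0)
		return 0.0;
	for (i = 0; i < 64; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/*
 * Population standard deviation of t[0..n-1], expressed as a
 * percentage of the mean. Returns 0 for empty input or a zero mean.
 */
static double stddev_pct_of_mean(const double *t, size_t n)
{
	double sum = 0.0, var = 0.0, mean;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += t[i];
	mean = sum / (double)n;
	if (mean == 0.0)
		return 0.0;
	for (i = 0; i < n; i++)
		var += (t[i] - mean) * (t[i] - mean);
	var /= (double)n;	/* population variance: divide by n, not n - 1 */
	return 100.0 * sqrt_newton(var) / mean;
}
```

A lower percentage means access times are clustered more tightly around
their mean, which is how the merge_nodes=0 rows above should be read.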

> --- a/Documentation/vm/ksm.txt
> +++ b/Documentation/vm/ksm.txt
> @@ -58,6 +58,12 @@ sleep_millisecs  - how many milliseconds ksmd should sleep before next scan
>                     e.g. "echo 20 > /sys/kernel/mm/ksm/sleep_millisecs"
>                     Default: 20 (chosen for demonstration purposes)
>  
> +merge_nodes      - specifies if pages from different numa nodes can be merged.
> +                   When set to 0, ksm merges only pages which physically
> +                   resides in the memory area of same NUMA node. It brings
> +                   lower latency to access to shared page.
> +                   Default: 1

s/resides/reside/.

This doc should mention that /sys/kernel/mm/ksm/run must be zeroed
before merge_nodes can be altered.  Otherwise confusion will reign.
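
A rough userspace sketch of the sequence such documentation would
describe: stop ksmd, change the knob, start ksmd again. The helper names
and the directory parameter are hypothetical; on a kernel carrying this
patch the directory would be /sys/kernel/mm/ksm. Going by the quoted
store function, the write can still fail with -EBUSY while
ksm_pages_shared is non-zero, so writing 2 to run (unmerge) rather than 0
may be needed first.

```c
#include <stdio.h>

/* Write a short string to <dir>/<name>. Returns 0 on success, -1 on failure. */
static int write_knob(const char *dir, const char *name, const char *val)
{
	char path[512];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(val, f) >= 0;
	if (fclose(f) != 0)	/* sysfs write errors may only surface here */
		ok = 0;
	return ok ? 0 : -1;
}

/* Stop ksmd, set merge_nodes, then restart ksmd. */
static int set_merge_nodes(const char *dir, int enable)
{
	if (write_knob(dir, "run", "0") != 0)
		return -1;
	if (write_knob(dir, "merge_nodes", enable ? "1" : "0") != 0)
		return -1;
	return write_knob(dir, "run", "1");
}
```

Taking the directory as a parameter keeps the sketch testable against a
scratch directory without root privileges.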

>
> ...
>
> +static ssize_t merge_nodes_store(struct kobject *kobj,
> +				   struct kobj_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	int err;
> +	unsigned long knob;
> +
> +	err = kstrtoul(buf, 10, &knob);
> +	if (err)
> +		return err;
> +	if (knob > 1)
> +		return -EINVAL;
> +
> +	if (ksm_run & KSM_RUN_MERGE)
> +		return -EBUSY;
> +
> +	mutex_lock(&ksm_thread_mutex);
> +	if (ksm_merge_nodes != knob) {
> +		if (ksm_pages_shared > 0)
> +			return -EBUSY;
> +		else
> +			ksm_merge_nodes = knob;
> +	}
> +	mutex_unlock(&ksm_thread_mutex);
> +
> +	return count;
> +}

Seems a bit racy.  Shouldn't the test of ksm_run be inside the locked
region?
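
A minimal sketch of that suggestion, written as a userspace model rather
than kernel code: a pthread mutex stands in for ksm_thread_mutex,
strtoul() stands in for kstrtoul(), and the globals are plain stand-ins
for the kernel variables named in the patch. The function name is
hypothetical. It moves the ksm_run test under the lock and uses a single
exit path, so the mutex is also released when ksm_pages_shared is
non-zero (the quoted version returns -EBUSY there with the mutex still
held).

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

#define KSM_RUN_MERGE 1

/* Userspace stand-ins for the kernel state used by the patch. */
static pthread_mutex_t ksm_thread_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long ksm_run;
static unsigned long ksm_pages_shared;
static unsigned long ksm_merge_nodes = 1;

static ssize_t merge_nodes_store_sketch(const char *buf, size_t count)
{
	char *end;
	unsigned long knob;
	ssize_t ret = (ssize_t)count;

	/* Parse and range-check before taking the lock. */
	errno = 0;
	knob = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;
	if (knob > 1)
		return -EINVAL;

	pthread_mutex_lock(&ksm_thread_mutex);
	if (ksm_run & KSM_RUN_MERGE) {
		/* Checked under the lock, so it cannot change underneath us. */
		ret = -EBUSY;
	} else if (ksm_merge_nodes != knob) {
		if (ksm_pages_shared > 0)
			ret = -EBUSY;
		else
			ksm_merge_nodes = knob;
	}
	pthread_mutex_unlock(&ksm_thread_mutex);	/* single unlock for all paths */

	return ret;
}
```

Whether ksm_thread_mutex alone is enough to serialize against every
writer of ksm_run is an assumption of this model, not something the
quoted hunk shows.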

> +KSM_ATTR(merge_nodes);
> +#endif
>
> ...
>

