* [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Shaohua Li <shaohua.li@intel.com> @ 2011-03-10  5:30 UTC
To: Andrew Morton
Cc: linux-mm, Andi Kleen, Minchan Kim, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

The zone->lru_lock is heavily contended in workloads where activate_page()
is used frequently. We can batch activate_page() to reduce the lock
contention. The batched pages will be added to the zone's LRU list when the
pool is full or when page reclaim is trying to drain them.

For example, on a 4 socket, 64 CPU system, create a sparse file and 64
processes that share a mapping of the file. Each process read-accesses the
whole file and then exits. The process exit does unmap_vmas() and causes a
lot of activate_page() calls. In such a workload, we saw about a 58%
reduction in total time with the patch below. Other workloads with a lot of
activate_page() calls benefit as well.

Andrew Morton suggested that activate_page() and putback_lru_pages() should
follow the same path to activate pages, but this is hard to implement (see
commit 7a608572a282a). On the other hand, do we really need
putback_lru_pages() to follow the same path? I tested several FIO/FFSB
benchmarks (about 20 scripts for each benchmark) on 3 machines here, from 2
sockets to 4 sockets. My tests don't show anything significant with or
without the patch below (there are slight differences, but mostly noise
that we saw even without the patch). The patch below basically returns to
the same approach as my first post.

I tested some microbenchmarks:
case-anon-cow-rand-mt		 0.58%
case-anon-cow-rand		-3.30%
case-anon-cow-seq-mt		-0.51%
case-anon-cow-seq		-5.68%
case-anon-r-rand-mt		 0.23%
case-anon-r-rand		 0.81%
case-anon-r-seq-mt		-0.71%
case-anon-r-seq			-1.99%
case-anon-rx-rand-mt		 2.11%
case-anon-rx-seq-mt		 3.46%
case-anon-w-rand-mt		-0.03%
case-anon-w-rand		-0.50%
case-anon-w-seq-mt		-1.08%
case-anon-w-seq			-0.12%
case-anon-wx-rand-mt		-5.02%
case-anon-wx-seq-mt		-1.43%
case-fork			 1.65%
case-fork-sleep			-0.07%
case-fork-withmem		 1.39%
case-hugetlb			-0.59%
case-lru-file-mmap-read-mt	-0.54%
case-lru-file-mmap-read		 0.61%
case-lru-file-mmap-read-rand	-2.24%
case-lru-file-readonce		-0.64%
case-lru-file-readtwice		-11.69%
case-lru-memcg			-1.35%
case-mmap-pread-rand-mt		 1.88%
case-mmap-pread-rand		-15.26%
case-mmap-pread-seq-mt		 0.89%
case-mmap-pread-seq		-69.72%
case-mmap-xread-rand-mt		 0.71%
case-mmap-xread-seq-mt		 0.38%

The most significant are:
case-lru-file-readtwice		-11.69%
case-mmap-pread-rand		-15.26%
case-mmap-pread-seq		-69.72%

which use activate_page() a lot. The others are basically run-to-run
variation, since each run differs slightly.

In the UP case, 'size mm/swap.o'
before the two patches:
   text	   data	    bss	    dec	    hex	filename
   6466	    896	      4	   7366	   1cc6	mm/swap.o
after the two patches:
   text	   data	    bss	    dec	    hex	filename
   6343	    896	      4	   7243	   1c4b	mm/swap.o

Signed-off-by: Shaohua Li <shaohua.li@intel.com>

---
 mm/swap.c |   45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

Index: linux/mm/swap.c
===================================================================
--- linux.orig/mm/swap.c	2011-03-09 12:56:09.000000000 +0800
+++ linux/mm/swap.c	2011-03-09 12:56:46.000000000 +0800
@@ -272,14 +272,10 @@ static void update_page_reclaim_stat(str
 		memcg_reclaim_stat->recent_rotated[file]++;
 }
 
-/*
- * FIXME: speed this up?
- */
-void activate_page(struct page *page)
+static void __activate_page(struct page *page, void *arg)
 {
 	struct zone *zone = page_zone(page);
 
-	spin_lock_irq(&zone->lru_lock);
 	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
 		int file = page_is_file_cache(page);
 		int lru = page_lru_base_type(page);
@@ -292,8 +288,45 @@ void activate_page(struct page *page)
 
 		update_page_reclaim_stat(zone, page, file, 1);
 	}
+}
+
+#ifdef CONFIG_SMP
+static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
+
+static void activate_page_drain(int cpu)
+{
+	struct pagevec *pvec = &per_cpu(activate_page_pvecs, cpu);
+
+	if (pagevec_count(pvec))
+		pagevec_lru_move_fn(pvec, __activate_page, NULL);
+}
+
+void activate_page(struct page *page)
+{
+	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
+		struct pagevec *pvec = &get_cpu_var(activate_page_pvecs);
+
+		page_cache_get(page);
+		if (!pagevec_add(pvec, page))
+			pagevec_lru_move_fn(pvec, __activate_page, NULL);
+		put_cpu_var(activate_page_pvecs);
+	}
+}
+
+#else
+static inline void activate_page_drain(int cpu)
+{
+}
+
+void activate_page(struct page *page)
+{
+	struct zone *zone = page_zone(page);
+
+	spin_lock_irq(&zone->lru_lock);
+	__activate_page(page, NULL);
 	spin_unlock_irq(&zone->lru_lock);
 }
+#endif
 
 /*
  * Mark a page as having seen activity.
@@ -461,6 +494,8 @@ static void drain_cpu_pagevecs(int cpu)
 	pvec = &per_cpu(lru_deactivate_pvecs, cpu);
 	if (pagevec_count(pvec))
 		pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
+
+	activate_page_drain(cpu);
 }
 
 /**
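For reference, the workload described in the changelog above can be sketched
in userspace roughly as follows. This is a hypothetical reproducer written
only for illustration; the file name, file size, page size, and process count
are assumptions rather than values from the original test harness:

/* Hypothetical reproducer sketch: nproc processes share-map a sparse file,
 * read every page, then exit; process exit runs unmap_vmas(), which is what
 * generates the burst of activate_page() calls described in the changelog.
 * All constants here are illustrative assumptions. */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	const size_t size = 1UL << 30;	/* 1 GB sparse file (assumed) */
	const int nproc = 64;		/* one process per CPU in the test box */
	int fd = open("sparse.dat", O_CREAT | O_RDWR, 0600);

	ftruncate(fd, size);		/* sparse: no blocks actually written */

	for (int i = 0; i < nproc; i++) {
		if (fork() == 0) {
			volatile char *p = mmap(NULL, size, PROT_READ,
						MAP_SHARED, fd, 0);

			for (size_t off = 0; off < size; off += 4096)
				(void)p[off];	/* fault in and reference every page */
			_exit(0);		/* exit path runs unmap_vmas() */
		}
	}
	for (int i = 0; i < nproc; i++)
		wait(NULL);
	close(fd);
	return 0;
}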
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Minchan Kim @ 2011-03-14 14:45 UTC
To: Shaohua Li
Cc: Andrew Morton, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Thu, Mar 10, 2011 at 01:30:19PM +0800, Shaohua Li wrote:
> The zone->lru_lock is heavily contended in workloads where activate_page()
> is used frequently. We can batch activate_page() to reduce the lock
> contention. The batched pages will be added to the zone's LRU list when the
> pool is full or when page reclaim is trying to drain them.
>
> +#ifdef CONFIG_SMP
> +static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
> +
> +static void activate_page_drain(int cpu)
> +{
> +	struct pagevec *pvec = &per_cpu(activate_page_pvecs, cpu);
> +
> +	if (pagevec_count(pvec))
> +		pagevec_lru_move_fn(pvec, __activate_page, NULL);
> +}
> +
> +void activate_page(struct page *page)
> +{
> +	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
> +		struct pagevec *pvec = &get_cpu_var(activate_page_pvecs);
> +
> +		page_cache_get(page);
> +		if (!pagevec_add(pvec, page))
> +			pagevec_lru_move_fn(pvec, __activate_page, NULL);
> +		put_cpu_var(activate_page_pvecs);
> +	}
> +}
> +
> +#else
> +static inline void activate_page_drain(int cpu)
> +{
> +}
> +
> +void activate_page(struct page *page)
> +{
> +	struct zone *zone = page_zone(page);
> +
> +	spin_lock_irq(&zone->lru_lock);
> +	__activate_page(page, NULL);
> 	spin_unlock_irq(&zone->lru_lock);
> }
> +#endif

Why do we need CONFIG_SMP only for activate_page_pvecs?
Does the per-cpu activate_page_pvecs consume a lot of memory in UP?
I don't think so. And if it did consume a lot of memory, that would be a
problem with per-cpu itself.

I can't understand why we should handle activate_page_pvecs specially.
Please enlighten me.

--
Kind regards,
Minchan Kim
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Shaohua Li @ 2011-03-15  1:53 UTC
To: Minchan Kim
Cc: Andrew Morton, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Mon, 2011-03-14 at 22:45 +0800, Minchan Kim wrote:
> Why do we need CONFIG_SMP only for activate_page_pvecs?
> Does the per-cpu activate_page_pvecs consume a lot of memory in UP?
> I don't think so. And if it did consume a lot of memory, that would be a
> problem with per-cpu itself.

No, not too much memory.

> I can't understand why we should handle activate_page_pvecs specially.
> Please enlighten me.

It's not that it is special; akpm asked me to do it this time. Reducing a
little memory is still worthwhile anyway, so that's it. We can do the same
for the other pvecs too, in a separate patch.

Thanks,
Shaohua
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Minchan Kim @ 2011-03-15  2:12 UTC
To: Shaohua Li
Cc: Andrew Morton, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Tue, Mar 15, 2011 at 10:53 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> On Mon, 2011-03-14 at 22:45 +0800, Minchan Kim wrote:
>> I can't understand why we should handle activate_page_pvecs specially.
>> Please enlighten me.
>
> It's not that it is special; akpm asked me to do it this time. Reducing a
> little memory is still worthwhile anyway, so that's it. We can do the same
> for the other pvecs too, in a separate patch.

Understandable, but I don't like separating the code with CONFIG_SMP for
just a small improvement in memory usage. In the future, whenever we use
per-cpu data, do we have to implement each function for both SMP and
non-SMP? Is that desirable?
Andrew, is it really valuable?

If everybody agrees, I don't oppose doing it this way.
But for now I vote for code cleanness over reducing the memory footprint.

--
Kind regards,
Minchan Kim
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Andrew Morton @ 2011-03-15  2:28 UTC
To: Minchan Kim
Cc: Shaohua Li, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Tue, 15 Mar 2011 11:12:37 +0900 Minchan Kim <minchan.kim@gmail.com> wrote:

> Understandable, but I don't like separating the code with CONFIG_SMP for
> just a small improvement in memory usage. In the future, whenever we use
> per-cpu data, do we have to implement each function for both SMP and
> non-SMP? Is that desirable?
> Andrew, is it really valuable?

It's a small saving in text footprint. It's also probably faster this way -
putting all the pages into a pagevec and then processing them later won't
be very L1-cache friendly.
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Minchan Kim @ 2011-03-15  2:40 UTC
To: Andrew Morton
Cc: Shaohua Li, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Tue, Mar 15, 2011 at 11:28 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> It's a small saving in text footprint. It's also probably faster this way -
> putting all the pages into a pagevec and then processing them later won't
> be very L1-cache friendly.

I am not sure how effective that is in UP. But if L1-cache friendliness is
an important concern, we should not use per-cpu data for hot operations.

I think the more important thing for embedded (typically UP) is lock
latency. I don't want to hold/release the lock per page.

--
Kind regards,
Minchan Kim
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Andrew Morton @ 2011-03-15  2:44 UTC
To: Minchan Kim
Cc: Shaohua Li, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Tue, 15 Mar 2011 11:40:46 +0900 Minchan Kim <minchan.kim@gmail.com> wrote:

> I am not sure how effective that is in UP. But if L1-cache friendliness is
> an important concern, we should not use per-cpu data for hot operations.

It's not due to the percpu thing. The issue is putting 14 pages into a
pagevec and then processing them later, by which time the older ones might
have fallen out of cache.

> I think the more important thing for embedded (typically UP) is lock
> latency. I don't want to hold/release the lock per page.

There is no lock on UP builds.
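To make the tradeoff in this exchange concrete, below is a simplified sketch
(illustrative only, not the actual mm/swap.c implementation) of what draining
the per-cpu pagevec buys: one lock/irq-disable round trip covers a whole
batch of up to PAGEVEC_SIZE (14) pages, at the price of touching each queued
page a second time later. For brevity it assumes every page in the pagevec
belongs to the same zone; the real pagevec_lru_move_fn() has to re-take the
lock when the zone changes. The unbatched UP path in the patch instead pays
the irq off/on cost once per page, which is the latency point raised in the
next message.

/* Simplified, single-zone sketch of draining a batch of pages onto the
 * active LRU under one lock acquisition.  Illustrative only. */
static void activate_page_drain_sketch(struct pagevec *pvec)
{
	struct zone *zone;
	int i;

	if (!pagevec_count(pvec))
		return;

	zone = page_zone(pvec->pages[0]);	/* assumption: one zone per batch */

	spin_lock_irq(&zone->lru_lock);		/* one irq-off section per batch */
	for (i = 0; i < pagevec_count(pvec); i++)
		__activate_page(pvec->pages[i], NULL);	/* move to the active LRU */
	spin_unlock_irq(&zone->lru_lock);

	release_pages(pvec->pages, pagevec_count(pvec), 0);	/* drop queued refs */
	pagevec_reinit(pvec);
}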
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Minchan Kim @ 2011-03-15  2:59 UTC
To: Andrew Morton
Cc: Shaohua Li, linux-mm, Andi Kleen, KOSAKI Motohiro, Rik van Riel, mel, Johannes Weiner

On Tue, Mar 15, 2011 at 11:44 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> I think the more important thing for embedded (typically UP) is lock
>> latency. I don't want to hold/release the lock per page.
>
> There is no lock on UP builds.

I mean the _frequent_ irq disabling. But I don't want to bother you over
this issue, as I said; it's up to you. If you merge the patch as-is, I will
help clean up the remaining things. But at least from my point of view, I
don't want to add frequent irq disabling.

--
Kind regards,
Minchan Kim
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: KOSAKI Motohiro @ 2011-03-15  2:32 UTC
To: Minchan Kim
Cc: kosaki.motohiro, Shaohua Li, Andrew Morton, linux-mm, Andi Kleen, Rik van Riel, mel, Johannes Weiner

> Understandable, but I don't like separating the code with CONFIG_SMP for
> just a small improvement in memory usage. In the future, whenever we use
> per-cpu data, do we have to implement each function for both SMP and
> non-SMP? Is that desirable?
> Andrew, is it really valuable?
>
> If everybody agrees, I don't oppose doing it this way.
> But for now I vote for code cleanness over reducing the memory footprint.

FWIW, the ifdef was added for embedded concerns, and I believe you are more
familiar with modern embedded trends than I am. So I have no objection to
removing it if you don't need it.

Thanks.
* Re: [PATCH 2/2 v4] mm: batch activate_page() to reduce lock contention
From: Minchan Kim @ 2011-03-15  2:43 UTC
To: KOSAKI Motohiro
Cc: Shaohua Li, Andrew Morton, linux-mm, Andi Kleen, Rik van Riel, mel, Johannes Weiner

On Tue, Mar 15, 2011 at 11:32 AM, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> FWIW, the ifdef was added for embedded concerns, and I believe you are more
> familiar with modern embedded trends than I am. So I have no objection to
> removing it if you don't need it.

I am keen on binary size, but at least in this case the benefit isn't big,
I think. I hope we care more about code cleanness and irq latency than
about memory footprint this time.

--
Kind regards,
Minchan Kim