From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3]) by kanga.kvack.org (Postfix) with ESMTP id DE149900086 for ; Fri, 15 Apr 2011 17:47:07 -0400 (EDT) Received: from hpaq12.eem.corp.google.com (hpaq12.eem.corp.google.com [172.25.149.12]) by smtp-out.google.com with ESMTP id p3FLl19X022400 for ; Fri, 15 Apr 2011 14:47:05 -0700 Received: from qyk36 (qyk36.prod.google.com [10.241.83.164]) by hpaq12.eem.corp.google.com with ESMTP id p3FLkTWU019614 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NOT) for ; Fri, 15 Apr 2011 14:47:00 -0700 Received: by qyk36 with SMTP id 36so7548qyk.4 for ; Fri, 15 Apr 2011 14:46:59 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20110415131617.91b0485c.kamezawa.hiroyu@jp.fujitsu.com> References: <1302821669-29862-1-git-send-email-yinghan@google.com> <1302821669-29862-2-git-send-email-yinghan@google.com> <20110415090445.4578f987.kamezawa.hiroyu@jp.fujitsu.com> <20110415131617.91b0485c.kamezawa.hiroyu@jp.fujitsu.com> Date: Fri, 15 Apr 2011 14:46:59 -0700 Message-ID: Subject: Re: [PATCH V4 01/10] Add kswapd descriptor From: Ying Han Content-Type: multipart/alternative; boundary=000e0cd68ee03dbcbf04a0fbfb95 Sender: owner-linux-mm@kvack.org List-ID: To: KAMEZAWA Hiroyuki Cc: KOSAKI Motohiro , Minchan Kim , Daisuke Nishimura , Balbir Singh , Pavel Emelyanov , Andrew Morton , Li Zefan , Mel Gorman , Christoph Lameter , Johannes Weiner , Rik van Riel , Hugh Dickins , Michal Hocko , Dave Hansen , Zhu Yanhai , linux-mm@kvack.org --000e0cd68ee03dbcbf04a0fbfb95 Content-Type: text/plain; charset=ISO-8859-1 On Thu, Apr 14, 2011 at 9:16 PM, KAMEZAWA Hiroyuki < kamezawa.hiroyu@jp.fujitsu.com> wrote: > On Thu, 14 Apr 2011 20:35:00 -0700 > Ying Han wrote: > > > On Thu, Apr 14, 2011 at 5:04 PM, KAMEZAWA Hiroyuki < > > kamezawa.hiroyu@jp.fujitsu.com> wrote: > > > > > On Thu, 14 Apr 2011 15:54:20 -0700 > > > Ying Han wrote: > > > > > > > There is a kswapd kernel thread for each numa node. We will add a > > > different > > > > kswapd for each memcg. The kswapd is sleeping in the wait queue > headed at > > > > kswapd_wait field of a kswapd descriptor. The kswapd descriptor > stores > > > > information of node or memcg and it allows the global and per-memcg > > > background > > > > reclaim to share common reclaim algorithms. > > > > > > > > This patch adds the kswapd descriptor and moves the per-node kswapd > to > > > use the > > > > new structure. > > > > > > > > > > No objections to your direction but some comments. > > > > > > > changelog v2..v1: > > > > 1. dynamic allocate kswapd descriptor and initialize the > wait_queue_head > > > of pgdat > > > > at kswapd_run. > > > > 2. add helper macro is_node_kswapd to distinguish per-node/per-cgroup > > > kswapd > > > > descriptor. > > > > > > > > changelog v3..v2: > > > > 1. move the struct mem_cgroup *kswapd_mem in kswapd sruct to later > patch. > > > > 2. rename thr in kswapd_run to something else. > > > > > > > > Signed-off-by: Ying Han > > > > --- > > > > include/linux/mmzone.h | 3 +- > > > > include/linux/swap.h | 7 ++++ > > > > mm/page_alloc.c | 1 - > > > > mm/vmscan.c | 95 > > > ++++++++++++++++++++++++++++++++++++------------ > > > > 4 files changed, 80 insertions(+), 26 deletions(-) > > > > > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > > > index 628f07b..6cba7d2 100644 > > > > --- a/include/linux/mmzone.h > > > > +++ b/include/linux/mmzone.h > > > > @@ -640,8 +640,7 @@ typedef struct pglist_data { > > > > unsigned long node_spanned_pages; /* total size of physical > page > > > > range, including holes */ > > > > int node_id; > > > > - wait_queue_head_t kswapd_wait; > > > > - struct task_struct *kswapd; > > > > + wait_queue_head_t *kswapd_wait; > > > > int kswapd_max_order; > > > > enum zone_type classzone_idx; > > > > > > I think pg_data_t should include struct kswapd in it, as > > > > > > struct pglist_data { > > > ..... > > > struct kswapd kswapd; > > > }; > > > and you can add a macro as > > > > > > #define kswapd_waitqueue(kswapd) (&(kswapd)->kswapd_wait) > > > if it looks better. > > > > > > Why I recommend this is I think it's better to have 'struct kswapd' > > > on the same page of pg_data_t or struct memcg. > > > Do you have benefits to kmalloc() struct kswapd on damand ? > > > > > > > So we don't end of have kswapd struct on memcgs' which doesn't have > > per-memcg kswapd enabled. I don't see one is strongly better than the > other > > for the two approaches. If ok, I would like to keep as it is for this > > verion. Hope this is ok for now. > > > > My intension is to remove kswapd_spinlock. Can we remove it with > dynamic allocation ? IOW, static allocation still requires spinlock ? > Thank you for pointing that out which made me thinking a little harder on this. I don't think we need the spinlock in this patch. This is something I inherited from another kswapd patch we did where we allow one kswapd to reclaim from multiple pgdat. We need the spinlock there since we need to protect the pgdat list per kswapd. However, we have one-to-one mapping here and we can get rid of the lock. I will remove it on the next post. --Ying > > Thanks, > -Kame > > > --000e0cd68ee03dbcbf04a0fbfb95 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

On Thu, Apr 14, 2011 at 9:16 PM, KAMEZAW= A Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
On Thu, 14 Apr 2011 20:35:00 -0700
Ying Han <yinghan@google.com> wrote:

> On Thu, Apr 14, 2011 at 5:04 PM, KAMEZAWA Hiroyuki <
> kamezawa.hiroyu@jp.f= ujitsu.com> wrote:
>
> > On Thu, 14 Apr 2011 15:54:20 -0700
> > Ying Han <yinghan@google= .com> wrote:
> >
> > > There is a kswapd kernel thread for each numa node. We will = add a
> > different
> > > kswapd for each memcg. The kswapd is sleeping in the wait qu= eue headed at
> > > kswapd_wait field of a kswapd descriptor. The kswapd descrip= tor stores
> > > information of node or memcg and it allows the global and pe= r-memcg
> > background
> > > reclaim to share common reclaim algorithms.
> > >
> > > This patch adds the kswapd descriptor and moves the per-node= kswapd to
> > use the
> > > new structure.
> > >
> >
> > No objections to your direction but some comments.
> >
> > > changelog v2..v1:
> > > 1. dynamic allocate kswapd descriptor and initialize the wai= t_queue_head
> > of pgdat
> > > at kswapd_run.
> > > 2. add helper macro is_node_kswapd to distinguish per-node/p= er-cgroup
> > kswapd
> > > descriptor.
> > >
> > > changelog v3..v2:
> > > 1. move the struct mem_cgroup *kswapd_mem in kswapd sruct to= later patch.
> > > 2. rename thr in kswapd_run to something else.
> > >
> > > Signed-off-by: Ying Han <yinghan@google.com>
> > > ---
> > > =A0include/linux/mmzone.h | =A0 =A03 +-
> > > =A0include/linux/swap.h =A0 | =A0 =A07 ++++
> > > =A0mm/page_alloc.c =A0 =A0 =A0 =A0| =A0 =A01 -
> > > =A0mm/vmscan.c =A0 =A0 =A0 =A0 =A0 =A0| =A0 95
> > ++++++++++++++++++++++++++++++++++++------------
> > > =A04 files changed, 80 insertions(+), 26 deletions(-)
> > >
> > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h=
> > > index 628f07b..6cba7d2 100644
> > > --- a/include/linux/mmzone.h
> > > +++ b/include/linux/mmzone.h
> > > @@ -640,8 +640,7 @@ typedef struct pglist_data {
> > > =A0 =A0 =A0 unsigned long node_spanned_pages; /* total size = of physical page
> > > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0 =A0 =A0 =A0range, including holes */
> > > =A0 =A0 =A0 int node_id;
> > > - =A0 =A0 wait_queue_head_t kswapd_wait;
> > > - =A0 =A0 struct task_struct *kswapd;
> > > + =A0 =A0 wait_queue_head_t *kswapd_wait;
> > > =A0 =A0 =A0 int kswapd_max_order;
> > > =A0 =A0 =A0 enum zone_type classzone_idx;
> >
> > I think pg_data_t should include struct kswapd in it, as
> >
> > =A0 =A0 =A0 =A0struct pglist_data {
> > =A0 =A0 =A0 =A0.....
> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct kswapd =A0 kswapd;
> > =A0 =A0 =A0 =A0};
> > and you can add a macro as
> >
> > #define kswapd_waitqueue(kswapd) =A0 =A0 =A0 =A0(&(kswapd)-&g= t;kswapd_wait)
> > if it looks better.
> >
> > Why I recommend this is I think it's better to have 'stru= ct kswapd'
> > on the same page of pg_data_t or struct memcg.
> > Do you have benefits to kmalloc() struct kswapd on damand ?
> >
>
> So we don't end of have kswapd struct on memcgs' which doesn&#= 39;t have
> per-memcg kswapd enabled. I don't see one is strongly better than = the other
> for the two approaches. If ok, I would like to keep as it is for this<= br> > verion. Hope this is ok for now.
>

My intension is to remove kswapd_spinlock. Can we remove it wit= h
dynamic allocation ? IOW, static allocation still requires spinlock ?

Thank you for pointing that out which made m= e thinking a little harder on this. I don't think we need the spinlock<= /div>
in this patch.

This is something I=A0inherite= d=A0from another kswapd patch we did where we allow one kswapd to reclaim f= rom multiple pgdat. We need the spinlock there since we need to protect the= pgdat list per kswapd. However, we have one-to-one
mapping here and we can get rid of the lock. I will remove it on the n= ext post.

--Ying

Thanks,
-Kame



--000e0cd68ee03dbcbf04a0fbfb95-- -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org